<?xml version="1.0" encoding="utf-8"?>



<feed xmlns="http://www.w3.org/2005/Atom"
    xmlns:fh="http://purl.org/syndication/history/1.0"
    xmlns:at="http://purl.org/atompub/tombstones/1.0">

    <title>Publ: Development Blog</title>
    <subtitle>A personal publishing system for the modern web</subtitle>
    <link href="http://publ.beesbuzz.biz/blog/feed?tag=tools" rel="self" />
    <link href="http://publ.beesbuzz.biz/blog/feed" rel="current" />
    <link href="https://busybee.superfeedr.com" rel="hub" />
    
    <link href="http://publ.beesbuzz.biz/blog/feed?date=2018-11" rel="prev-archive" />
    
    
    <link href="http://publ.beesbuzz.biz/blog/" />
    <fh:archive />
    <id>tag:publ.beesbuzz.biz,2020-01-07:blog</id>
    <updated>2019-04-28T02:39:54-07:00</updated>

    
    <entry>
        <title>Reblob!</title>
        <link href="http://publ.beesbuzz.biz/blog/179-Reblob" rel="alternate" type="text/html" />
        <published>2019-04-28T02:39:54-07:00</published>
        <updated>2019-04-28T02:39:54-07:00</updated>
        <id>urn:uuid:3f70af57-df12-58e2-b054-42671984153f</id>
        <author><name>fluffy</name></author>
        <content type="html">
<![CDATA[
<p>It&rsquo;s been a while since I&rsquo;ve worked on IndieWeb stuff, but I finally got around to releasing an <em>extremely preliminary</em> version of <a href="http://publ.beesbuzz.biz/tools/1423-reblob">reblob</a>, a little commandline thingus to make this stuff easier. Eventually I&rsquo;ll also have a server-based version here, at least as an example.</p>

]]>
        </content>
    </entry>
    
    <entry>
        <title>Pushl v0.2.4, now with a proper user-agent</title>
        <link href="http://publ.beesbuzz.biz/blog/1088-Pushl-v0.2.4-now-with-a-proper-user-agent" rel="alternate" type="text/html" />
        <published>2019-03-15T17:29:27-07:00</published>
        <updated>2019-03-15T17:29:27-07:00</updated>
        <id>urn:uuid:c049a5e2-0598-5c41-b961-6f16cbfe5ab6</id>
        <author><name>fluffy</name></author>
        <content type="html">
<![CDATA[
<p>While trying to figure out some weird access patterns on the day-job site, I realized that Pushl wasn&rsquo;t actually specifying a user-agent, so it was just coming through as the generic <code>aiohttp</code> one, which isn&rsquo;t very friendly.</p><p>Now it sends a reasonable user-agent by default, and this can be overridden with the <code>--user-agent</code> flag if you want to, for your own analytics or whatever.</p><p>Oh, and I had quietly released 0.2.3 a few days ago; it just had some minor internal changes to logging and also declared Pushl as beta, rather than alpha, software.</p>
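<p>As a rough illustration of the default-plus-override idea, here is a minimal sketch using only the standard library. It is not Pushl&rsquo;s actual code: the <code>build_headers</code> helper and the <code>DEFAULT_USER_AGENT</code> string are hypothetical names made up for this example, and only the <code>--user-agent</code> flag itself comes from the release.</p>

```python
import argparse

# Hypothetical default; the string Pushl really sends may differ.
DEFAULT_USER_AGENT = "ExampleBot/0.1 (+https://example.com/bot)"


def build_headers(argv=None):
    """Parse the command line and return the HTTP headers to send."""
    parser = argparse.ArgumentParser(description="user-agent override sketch")
    parser.add_argument("--user-agent", dest="user_agent",
                        default=DEFAULT_USER_AGENT,
                        help="User-Agent string to send with every request")
    args = parser.parse_args(argv)
    # Every outgoing request would carry this header.
    return {"User-Agent": args.user_agent}
```

<p>The resulting dictionary can then be handed to whatever HTTP client is in use; <code>aiohttp</code>, for instance, accepts a <code>headers</code> argument when a client session is created.</p>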

]]>
        </content>
    </entry>
    
    <entry>
        <title>Pushl 0.2.2</title>
        <link href="http://publ.beesbuzz.biz/blog/756-Pushl-0.2.2" rel="alternate" type="text/html" />
        <published>2019-03-10T18:25:58-07:00</published>
        <updated>2019-03-10T18:25:58-07:00</updated>
        <id>urn:uuid:d9b55a70-8000-5092-be97-2775d4a24cba</id>
        <author><name>fluffy</name></author>
        <content type="html">
<![CDATA[
<p>I&rsquo;ve done a bunch more work on Pushl to try to get it more stable. In particular, I&rsquo;ve made it so that it will only recurse into feeds that are on domains that were declared in the initial requests, and I seem to have cleared up some cases which were causing it to hang and also added a global timeout which will, hopefully, prevent it from hanging indefinitely.</p><p>I do wish I could figure out what is causing the hangs when they do happen though. Oh well. Some discussion of the issue below the cut.</p>

<p>So, there are two main tasks, <a href="https://github.com/PlaidWeb/Pushl/blob/01b1d438382bd5c06851626d3dadcd6e3d8cb3f3/pushl/__init__.py#L31"><code>process_feed</code></a> and <a href="https://github.com/PlaidWeb/Pushl/blob/01b1d438382bd5c06851626d3dadcd6e3d8cb3f3/pushl/__init__.py#L96"><code>process_entry</code></a>, which can both be spawned by the command line processor, and which can also spawn each other. (<code>process_feed</code> generally spawns <code>process_entry</code> as a matter of course, while <code>process_entry</code> only spawns <code>process_feed</code> if <code>-r</code> is set.)</p><p>Both of these tasks will asynchronously fetch the data for the item itself, but then will gather a list of additional tasks to start in parallel, such as sending off WebSub/Webmention notifications or the aforementioned additional feed and entry processing tasks. And, because of the way <code>asyncio</code> works, the last thing each task does is wait for its pending tasks to complete.</p><p>The thing is, the <em>only</em> thing that <em>ever</em> hangs is that pending wait!</p><p>I&rsquo;ve added a lot of logging to everything to see where every part of every process begins and ends, in a way that I can match things up in pairs, and every single individual task completes. But that <code>await asyncio.wait(pending)</code> will sometimes just wait forever. If I inspect the list of pending tasks when this does happen, every one is in the <code>done</code> state, so <code>asyncio.wait</code> should just be returning for them. But it isn&rsquo;t.</p><p>It&rsquo;s not even deterministic, which means that there&rsquo;s probably something timing-related going on. Which would make me worry about there being a deadlock, but&hellip; there&rsquo;s nowhere that a deadlock could sneak in, either. 
Any time a task is fired off it&rsquo;s done as a new instance (except for the specific case of getting a webmention endpoint, which is cached using <code>async_lru</code> but doesn&rsquo;t have any dependencies on anything that has a pending list, and isn&rsquo;t a thing that&rsquo;s hanging anyway), any duplicated work is discarded before any <code>await</code> statement (so there&rsquo;s no way any cyclic dependencies are happening), all local file access is non-asynchronous, and like, when it does hang, the usual pattern is that there will be 2-3 <code>process_feed</code> tasks waiting on 6-7 <code>process_entry</code> tasks, which have all completed all of their async work but are waiting on <em>their</em> pending tasks.</p><p>I&rsquo;m sure there&rsquo;s just some dang typo somewhere that is causing something weird to happen, although <code>pylint</code> and <code>flake8</code> haven&rsquo;t found any of the usual telltale signs of that.</p><p>But of course, now that I&rsquo;ve written a blog entry about trying to diagnose the problem, I can&rsquo;t get the problem to recur, even on things that used to reproduce it 100%. <strong>WHATEVER.</strong></p>
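<p>For anyone trying to picture the structure described above, here is a stripped-down sketch of the spawn-then-wait pattern, with a timeout on the wait so it can&rsquo;t block forever. It is only an approximation under stated assumptions: the function bodies are placeholders rather than Pushl&rsquo;s real <code>process_feed</code> and <code>process_entry</code>, and the <code>seen</code> list, the <code>run</code> helper, and the 5-second timeout are invented for the example.</p>

```python
import asyncio


async def process_entry(url, seen):
    """Hypothetical stand-in for processing one entry."""
    await asyncio.sleep(0)  # placeholder for the real asynchronous fetch
    seen.append(url)


async def process_feed(entry_urls, seen, timeout=5):
    """Spawn one task per entry, then wait for all of them with a timeout."""
    pending = [asyncio.ensure_future(process_entry(url, seen))
               for url in entry_urls]
    if not pending:
        # asyncio.wait raises ValueError when given an empty collection
        return 0, 0
    done, not_done = await asyncio.wait(pending, timeout=timeout)
    for task in not_done:
        task.cancel()  # give up on stragglers rather than waiting forever
    return len(done), len(not_done)


def run(entry_urls):
    """Drive the sketch from synchronous code."""
    seen = []
    counts = asyncio.run(process_feed(entry_urls, seen))
    return counts, sorted(seen)
```

<p>Calling <code>run</code> with a list of URLs returns how many tasks finished, how many were cancelled at the timeout, and which entries were processed. A timeout like this doesn&rsquo;t explain the hang, of course; it only bounds how long a stuck wait can last.</p>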

]]>
        </content>
    </entry>
    
    <entry>
        <title>Pushl v0.2.1 released</title>
        <link href="http://publ.beesbuzz.biz/blog/446-Pushl-v0.2.1-released" rel="alternate" type="text/html" />
        <published>2019-03-07T22:27:02-08:00</published>
        <updated>2019-03-07T22:27:02-08:00</updated>
        <id>urn:uuid:3530a187-4a9c-55a0-89a2-76ef058ea10b</id>
        <author><name>fluffy</name></author>
        <content type="html">
<![CDATA[
<p>I&rsquo;ve been working on getting Pushl much more stable and reliable, particularly around a persistent &ldquo;too many open files&rdquo; error I was having, which turned out to be primarily due to a fd leak in the caching routines. Oops.</p><p>Anyway, there&rsquo;s also seemingly a problem with how <code>aiohttp</code> manages its connection pool, at least on macOS, so I&rsquo;ve disabled connection keep-alive by default. However, if you still want to use keep-alive, there&rsquo;s now a <code>--keepalive</code> option to allow you to do that. I&rsquo;m finding that it doesn&rsquo;t really improve performance all that much anyway.</p><p>This is feeling beta-ready but I&rsquo;ll give it a few days for other issues to shake out first.</p>

]]>
        </content>
    </entry>
    
    <entry>
        <title>Pushl v0.2.0 released</title>
        <link href="http://publ.beesbuzz.biz/blog/894-Pushl-v0.2.0-released" rel="alternate" type="text/html" />
        <published>2019-03-07T00:05:24-08:00</published>
        <updated>2019-03-07T00:05:24-08:00</updated>
        <id>urn:uuid:105de13e-a3ea-5565-bdc9-7bcd63543249</id>
        <author><name>fluffy</name></author>
        <content type="html">
<![CDATA[
<p>So, I just released v0.2.0 of <a href="https://github.com/PlaidWeb/Pushl">Pushl</a>. It was a <a href="https://github.com/PlaidWeb/Pushl/compare/v0.1.8..v0.2.0">pretty big change</a>, in that I pretty much rewrote all the networking stuff, and fixed some pretty ridiculous bugs with the caching implementation as well.</p><p>The main thing is now it&rsquo;s using async I/O instead of thread-per-connection, so it&rsquo;s way more efficient and also times out correctly.</p><p>And oh gosh, I had so many tiny but critical errors in the way caching was implemented &ndash; no <em>wonder</em> it kept on acting as if there was no cached state. Yeesh.</p><p>Anyway, I&rsquo;ll let this run on my site for a few days and if I like what I see I&rsquo;ll upgrade it to beta status on PyPI.</p>

]]>
        </content>
    </entry>
    
    <entry>
        <title>An early-alpha Movable Type importer</title>
        <link href="http://publ.beesbuzz.biz/blog/999-An-early-alpha-Movable-Type-importer" rel="alternate" type="text/html" />
        <published>2019-02-20T15:42:18-08:00</published>
        <updated>2019-02-20T15:42:18-08:00</updated>
        <id>urn:uuid:11c6297c-2dc9-54f6-a781-ec43d0ed1d00</id>
        <author><name>fluffy</name></author>
        <content type="html">
<![CDATA[
<p>For those folks who want to import their content from Movable Type over to Publ, I&rsquo;ve finally gotten around to writing an <a href="https://github.com/PlaidWeb/mt2publ">importer</a>. Currently it only attempts to convert entry content and category metadata, and only using SQLite-formatted database dumps.</p><p>See its <code>README.md</code> for the (incredibly rough) usage instructions.</p><p>Eventually I want to try to automatically convert templates from MT&rsquo;s scripting language to Jinja-Publ templates, although there&rsquo;s a bunch of stuff that&rsquo;s going to be difficult to port across and a lot of stuff is just plain not feasible to even try, so don&rsquo;t expect that to become a major thing any time soon.</p>

]]>
        </content>
    </entry>
    
    <entry>
        <title>Pushl v0.1.7</title>
        <link href="http://publ.beesbuzz.biz/blog/838-Pushl-v0.1.7" rel="alternate" type="text/html" />
        <published>2019-01-14T21:28:44-08:00</published>
        <updated>2019-01-14T21:28:44-08:00</updated>
        <id>urn:uuid:f79f8d1c-06ea-5472-8bdd-752bf26e2d73</id>
        <author><name>fluffy</name></author>
        <content type="html">
<![CDATA[
<p>I ended up doing some more work on Pushl and have now released v0.1.7. The major changes:</p>
<ul>
<li>Did a bunch of refactoring to make the code a little cleaner and handle configuration more appropriately</li>
<li>Added a configurable timeout for connections (which now defaults to 15 seconds)</li>
<li>Added a <code>--version</code> command-line option</li>
</ul>
<p>Also, some suggested usage ideas below the cut!</p>

<h3 id="838_h3_1_Installation"><a href="http://publ.beesbuzz.biz/blog/838-Pushl-v0.1.7#838_h3_1_Installation"></a>Installation</h3><p>An installation guide is available in the <a href="https://github.com/PlaidWeb/Pushl/blob/master/README.md">project README</a>, but the short version is to make sure you have <a href="http://python.org">Python 3</a> available and then run the following at a command prompt:</p><figure class="blockcode"><pre class="highlight" data-language="bash" data-line-numbers><span class="line" id="e838cb1L1"><a class="line-number" href="http://publ.beesbuzz.biz/blog/838-Pushl-v0.1.7#e838cb1L1"></a><span class="line-content">pip3<span class="w"> </span>install<span class="w"> </span>pushl</span></span>
</pre></figure><p>which should do everything you need to install it. (On Linux or macOS you may need to run <code>sudo pip3 install pushl</code> depending on how your system is set up.)</p><h3 id="838_h3_2_Some-usage-ideas"><a href="http://publ.beesbuzz.biz/blog/838-Pushl-v0.1.7#838_h3_2_Some-usage-ideas"></a>Some usage ideas</h3><p>The main use for Pushl is to send Webmentions and Pingbacks from any arbitrary blog to link targets, regardless of blogging platform (for example, using Jekyll, Movable Type, Pelican, or, of course, Publ). But it can be used for a lot more than that!</p><p>For example, the <code>-e</code>/<code>--entry</code> flag can be used to send webmentions from a specific page:</p><figure class="blockcode"><pre class="highlight" data-language="bash" data-line-numbers><span class="line" id="e838cb2L1"><a class="line-number" href="http://publ.beesbuzz.biz/blog/838-Pushl-v0.1.7#e838cb2L1"></a><span class="line-content">pushl<span class="w"> </span>-e<span class="w"> </span>http://example.com/blog/page/12345</span></span>
</pre></figure><p>And if this page embeds feed discovery tags, you can combine that with <code>-r</code> to also recursively apply to its feeds; for example:</p><figure class="blockcode"><pre class="highlight" data-language="bash" data-line-numbers><span class="line" id="e838cb3L1"><a class="line-number" href="http://publ.beesbuzz.biz/blog/838-Pushl-v0.1.7#e838cb3L1"></a><span class="line-content">pushl<span class="w"> </span>-re<span class="w"> </span>http://forum.example.com/</span></span>
</pre></figure><p>This works especially well with forum software such as phpBB and XenForo, both of which support feed discovery. And this will help website publishers to know when their content is being discussed, with forum posts appearing as &ldquo;pingbacks&rdquo; on their site!</p><p>Of course, when using it with a forum or a sporadically-updating blog or whatever you&rsquo;ll probably want it to be in a cron job. There&rsquo;s more information about how to set that up in <a href="https://github.com/PlaidWeb/Pushl/blob/master/README.md">the project README</a>.</p>

]]>
        </content>
    </entry>
    
    <entry>
        <title>Pushl v0.1.6 released</title>
        <link href="http://publ.beesbuzz.biz/blog/1318-Pushl-v0.1.6-released" rel="alternate" type="text/html" />
        <published>2019-01-13T20:48:35-08:00</published>
        <updated>2019-01-13T20:48:35-08:00</updated>
        <id>urn:uuid:a944ce4f-8cda-5398-9f7c-155efe760373</id>
        <author><name>fluffy</name></author>
        <content type="html">
<![CDATA[
<p>It&rsquo;s been a while since I&rsquo;ve updated <a href="http://pypi.org/project/Pushl">Pushl</a> but today I released v0.1.6. It includes the following changes:</p>
<ul>
<li>Now it supports Pingback as well as Webmention</li>
<li>Improved the threading defaults and connection pooling</li>
<li>Also checks entries for updates even if the feed didn&rsquo;t change (in case something changed in the more text or page metadata or whatever)</li>
</ul>
<p>Anyway, it should just be a <code>pip install --upgrade pushl</code> (or <code>pipenv update</code>) away.</p>

]]>
        </content>
    </entry>
    
    <entry>
        <title>Pushl v0.1.5</title>
        <link href="http://publ.beesbuzz.biz/blog/1171-Pushl-v0.1.5" rel="alternate" type="text/html" />
        <published>2018-12-22T01:35:02-08:00</published>
        <updated>2018-12-22T01:35:02-08:00</updated>
        <id>urn:uuid:6f29f27f-d18b-5f91-a1bf-a0ff6679c11a</id>
        <author><name>fluffy</name></author>
        <content type="html">
<![CDATA[
<p>While I&rsquo;m fixing random stuff in Publ, I figured I&rsquo;d finally fix some problems with Pushl too. Nothing major here, just:</p>
<ul>
<li>Stability: Fixed a bug where feeds that don&rsquo;t declare links caused the worker to die before entries got processed</li>
<li>Performance: Now we use a global connection pool (so connections can be reused)</li>
<li>Fixed a <a href="https://github.com/PlaidWeb/Pushl/issues/9">minor correctness issue</a> with archive feeds (which actually doesn&rsquo;t make any difference in the real world but whatever)</li>
</ul>


]]>
        </content>
    </entry>
    
    <entry>
        <title>Embedding webmention.io pings on your site</title>
        <link href="http://publ.beesbuzz.biz/blog/1048-Embedding-webmention.io-pings-on-your-site" rel="alternate" type="text/html" />
        <published>2018-12-20T23:14:47-08:00</published>
        <updated>2018-12-20T23:14:47-08:00</updated>
        <id>urn:uuid:66ca748a-4597-5381-a0c9-fcf41f7fb75e</id>
        <author><name>fluffy</name></author>
        <content type="html">
<![CDATA[
<p>Are you using <a href="https://webmention.io">webmention.io</a> as your webmention endpoint? Want to get your incoming webmentions displayed on your website?</p><p>Well, you&rsquo;re in luck: I wrote <a href="http://publ.beesbuzz.biz/static/webmention.js">a simple-ish script for that</a>. (You&rsquo;ll probably also want to see <a href="http://publ.beesbuzz.biz/static/webmentions.css">the accompanying stylesheet</a>.) And it doesn&rsquo;t even require that you use Publ &ndash; it should work with any CMS, static or dynamic. The only requirement is that you use either webmention.io or something that has a similar enough retrieval API.</p><p>I wrote more about it on <a href="https://beesbuzz.biz/blog/3743-More-fun-with-Webmentions">my blog</a>, where you can also see it in use. For now, I&rsquo;m just going to use the <a href="http://publ.beesbuzz.biz/github-site">sample site repository</a> to manage it (and issues against it).</p><p>It&rsquo;s MIT-licensed, so feel free to use it wherever and however you want and to modify it for your needs. I might improve it down the road but for now it&rsquo;s mostly just a quick itch-scratching hack that does things the way I want it to.</p>

]]>
        </content>
    </entry>
    

    
</feed>