For a project I've been working on I needed a simple spider which would, given a start URL, recursively collect all the URLs it could find.
In the past I've used the excellent PhantomJS headless webkit browser for automation, but writing complex navigation scenarios can be a bit long-winded. Enter CasperJS. Built on top of PhantomJS, it simplifies the process and provides some nice syntactic sugar to boot.
The spider I wrote grabs the first page, finds all of the links, then, by pushing each URL onto the end of a list and shifting the next URL off the front, follows each link in the order in which it was found. Going recursive is key; the casper.open() method doesn't block, so without recursion there would be trouble.
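To make the visiting order concrete, here is a minimal Python sketch of the same first-found, first-followed traversal. It is not the CasperJS spider itself: the LINKS map and the crawl() function are hypothetical stand-ins for real page fetches, and because the lookup here is synchronous a plain loop is enough, whereas the non-blocking casper.open() is what calls for recursion in the real thing.

```python
from collections import deque

# Hypothetical in-memory site map standing in for real page fetches:
# each URL maps to the links found on that page, in page order.
LINKS = {
    "/": ["/a", "/b"],
    "/a": ["/b", "/c"],
    "/b": ["/"],
    "/c": [],
}


def crawl(start_url, get_links):
    """Visit URLs in the order they were discovered (first in, first out)."""
    pending = deque([start_url])  # URLs waiting to be visited
    seen = {start_url}            # URLs already queued, so loops are skipped
    visited = []                  # final visit order

    while pending:
        url = pending.popleft()   # take the oldest URL off the front
        visited.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                pending.append(link)  # newly found URLs go on the end
    return visited


order = crawl("/", lambda url: LINKS.get(url, []))
print(order)  # ['/', '/a', '/b', '/c']
```

The seen set is an addition of mine for the sketch; without something like it, pages that link back to each other would be visited forever.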
The following code is the core spider, which should be easy to adapt for most purposes:
I use Git for managing all of my code and documentation. To give me the flexibility I need, I've always hosted my own Git server using Gitosis, which was less than fun to work with and has now been deprecated within the Git community. I'm setting up a new server, so I've been looking for other solutions. The most commonly recommended replacement is Gitolite, which looks like a great solution. If you don't need fine-grained access control, however, there are even easier solutions.
Scott Chacon's excellent Pro Git book mentions several ways to set up a Git server using plain Git. One of the more flexible options is to set up an SSH user for Git and add the SSH public key of each user to the Git user's authorized_keys file. Anyone whose public key is in the list can access the hosted Git repositories. It's remarkably easy to set up a Git server in this manner.
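The key-management step is simple enough to script. The following Python sketch appends a public key to an authorized_keys file only if it isn't already listed. The add_public_key() helper and the example key are hypothetical, and the demo writes to a temporary directory rather than a real Git user's home directory.

```python
import tempfile
from pathlib import Path

# Made-up placeholder, not a real key.
EXAMPLE_KEY = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIExampleKeyOnly alice@laptop"


def add_public_key(authorized_keys, public_key):
    """Append public_key to the authorized_keys file unless already present.

    Returns True if the key was added, False if it was already listed.
    """
    path = Path(authorized_keys)
    key = public_key.strip()
    text = path.read_text() if path.exists() else ""
    if key in (line.strip() for line in text.splitlines()):
        return False
    path.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    with path.open("a") as handle:
        if text and not text.endswith("\n"):
            handle.write("\n")  # keep the new key on its own line
        handle.write(key + "\n")
    # OpenSSH's StrictModes (on by default) rejects key files that
    # other users can write to, so keep the file private.
    path.chmod(0o600)
    return True


with tempfile.TemporaryDirectory() as tmp:
    keys_file = Path(tmp) / ".ssh" / "authorized_keys"
    first = add_public_key(keys_file, EXAMPLE_KEY)   # key appended
    second = add_public_key(keys_file, EXAMPLE_KEY)  # already listed
    contents = keys_file.read_text()

print(first, second)  # True False
```

On a real server the path would be the authorized_keys file in the Git user's .ssh directory, and the script would need to run as that user.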
I've put together an ancient Egyptian hieroglyphic alphabet reference table which can be used when learning the Egyptian language, or just for fun if you're curious about how Egyptian was written. The table shows each glyph with a description of each sign, along with information about the sign's transliteration values.
As an extra bonus, I've created a small order-the-alphabet game at the bottom of the page which will help you to memorise the order of the alphabet as you will find it in most Egyptological resources (dictionaries, book indices, etc). If you're planning on using a hard-copy Egyptian dictionary, knowing the order in which the words are listed can save a lot of time!
To play the game, just drag around the hieroglyphs until you think you've got them in the right order, then hit the check order button to see whether you were right! There's no penalty if you get the order wrong. In fact, I encourage you to check as you go to make sure you're on the right track. 1,000 bonus points* for getting the hieroglyphic alphabet in the right order on the first try! Good luck!
* Bonus points may be hypothetical and existence can't be guaranteed outside of your own mind.
A few months back I set out to create some online hieroglyphic flashcards to help me in my studies of the ancient Egyptian language. Today I am happy to announce that all of the data has been entered and the flashcards are ready for use!
I started work on the cards after I invested in some blank playing cards with which to create physical flashcards to help me remember the common hieroglyphs found in the written Egyptian language. It was taking a while, and my drawing skills are terrible; after finishing a few cards I had a thought: "Why not put the effort into creating some online flashcards?". So I did, and they are now online to use as you wish!
Information for the flashcards is based on James P. Allen's Middle Egyptian: An Introduction to the Language and Culture of Hieroglyphs and was entered entirely by hand, then cross-referenced with lists entered by others with the help of a few small Python scripts to make sure that there were no mistakes. Where alternative signs exist they have been used, so you will see several variants of the uniliteral š, for example.
There are three keyboards to choose from: British, European and Manuel de Codage. Transliterations can be entered using your computer keyboard in MDC format to save time, and a score is kept to show your progress in each section.
Feedback is most welcome. At the moment signs appear randomly, with the exception that you will never get the same flashcard twice in a row; I'm working on a training algorithm to present signs you have trouble with more often. I'm also working on some full Egyptian word flashcards based on Mark Vygus' amazing 17,000+ word dictionary, as well as a fully searchable online Egyptian dictionary based on the same data, so keep your eyes open!
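The current selection rule (random, but never the same card twice in a row) can be sketched in a few lines of Python. This is an illustration rather than the site's actual code: next_card() is a hypothetical helper and the sign labels are just sample Manuel de Codage codes.

```python
import random


def next_card(cards, previous=None, rng=random):
    """Pick a random card, never repeating the one shown immediately before."""
    candidates = [card for card in cards if card != previous]
    if not candidates:  # only one distinct card: a repeat is unavoidable
        candidates = list(cards)
    return rng.choice(candidates)


signs = ["A", "i", "y", "a", "w", "b"]  # sample uniliteral codes
rng = random.Random(0)  # seeded so the demo run is repeatable

shown = []
previous = None
for _ in range(50):
    previous = next_card(signs, previous, rng)
    shown.append(previous)

# No two neighbouring draws are the same card.
print(all(a != b for a, b in zip(shown, shown[1:])))  # True
```

A training algorithm could build on the same shape by replacing rng.choice() with a weighted draw that favours signs answered incorrectly.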
My Raspberry Pi arrived on Tuesday! It's now Thursday and I haven't had much time to play around with it yet, but I have installed the Debian image and checked that everything is in working order.
The thought that my Pi may be DOA crossed my mind shortly after it arrived when I installed Debian onto the SD card, plugged everything in, and nothing seemed to happen. The power light was active, but the "OK" light remained dark. After doing some troubleshooting I found out that the 8GB SanDisk Class 10 SD card I was using is known to be a problem SD card and won't work with the Pi due to a bug in the Broadcom bootloader. Luckily, the SD card in my camera was compatible, so I swapped them over and installed Debian. This time both the power and "OK" lights lit up, along with the network lights a few seconds later.
The problem I now faced was that I had no HDMI-to-DVI cable, so I couldn't actually see anything. Debian uses DHCP to get an IP address on boot, so I tried a quick network scan with nmap, but SSH wasn't open on the Pi; it turns out it's disabled by default for security reasons. Probably a good idea given the default username and password and the Pi's keenness to connect to the network. I plugged in my old Happy Hacking keyboard and took a stab at blindly enabling SSH, which worked. For anyone in the same position, enter pi followed by raspberry to log in, then sudo /etc/init.d/ssh start to start the SSH server.
I recently had a discussion with a client of mine who mentioned that they manually upload backups of their website to Google Docs, and that they had always wished there was a way to FTP them up to save time. That's an interesting idea, said I; I just happen to have written a Google Docs API interaction class in PHP which can be used to upload files to Docs. I wondered whether I could do the same thing in Python and automate the whole backup process…
I've been working on a system monitoring tool which needs to determine whether the uptime of a Linux slave machine has changed since it last reported in. I looked through Python's online documentation and it turns out that there isn't a function among the standard modules for doing this (not even in the handy os module).
I did some searching around to see how people were getting the uptime of a host in Python and a surprising number of people advocate launching a subprocess and calling Linux's uptime command, then parsing the output. But there's a much better way!
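One way to avoid the subprocess on Linux, assuming /proc is mounted, is to read /proc/uptime directly. The sketch below takes that route. The function names are my own for illustration, has_rebooted() is a hypothetical take on the "has it changed since the last report" check, and the sample string is made up to show the file's format.

```python
def parse_uptime(text):
    """Return uptime in seconds from the contents of /proc/uptime.

    The file holds two numbers: seconds since boot and cumulative idle
    seconds; only the first is needed here.
    """
    return float(text.split()[0])


def read_uptime(path="/proc/uptime"):
    """Read and parse the uptime file (Linux only)."""
    with open(path) as handle:
        return parse_uptime(handle.read())


def has_rebooted(previous_uptime, current_uptime):
    """A machine whose uptime went down has restarted since the last report."""
    return current_uptime < previous_uptime


sample = "350735.47 234388.90\n"  # made-up example of the file's format
print(parse_uptime(sample))       # 350735.47
```

On a live Linux machine, read_uptime() returns the current figure with no parsing of human-readable command output.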
Jumi is a handy Joomla extension that lets you include arbitrary code in Joomla without messing around more than you have to. I recently upgraded a client's install to Joomla 1.7.3 and tried to install Jumi, but got the following error:
Warning: constant() [function.constant]: Couldn't find constant JPATH_ in .../libraries/joomla/installer/adapters/module.php on line 115
Module install success
Plugin install success
Router install success
The extension mostly worked, but translations were missing. To cut a long story short, after some scraping around I found that the problem is fixed in Jumi 2.0.7, which is the version shown on the download page and the version I thought I'd downloaded. It turns out that the download link actually points to 2.0.6, which doesn't work with Joomla 1.7. I checked the trunk version in the SVN repository and it's tagged 2.0.7, so I tried it out and installation was flawless.
I've recently had the honour of being accepted onto the NAO robot developer programme. I'm more than a little excited. The NAO is an awesome little robot produced by the French company Aldebaran Robotics. It's designed for a wide range of uses and is available to educational establishments, RoboCup teams, and selected developers(!), with a planned public release in the next year or two.
The NAO is one of the most advanced humanoid robots in the world and is pretty dexterous. Want it to dance? No problem! Do a little Star Wars reenactment? Easy! But with speech and face recognition and motion control built in as standard, the NAO can do far more than just repeat a set of pre-programmed motions. The official development programme video has a few ideas. I think it can do more.
The game is afoot. The old is being swept away and replaced with new. Shiny new (X)HTML 5 has replaced XHTML 1.1. If anything looks out of place please badger your browser vendor to improve HTML 5 and CSS 3 support (or use Lynx, which always works). Unless I've made a mistake, in which case tell me ;-)
Plan Zero is served using a lightweight and flexible MVC framework which I crafted in PHP and named the Neutrino Framework (small, light and fast!) after the subatomic particle. The framework was born out of the need for a platform upon which to serve websites with minimum effort while at the same time keeping access to PHP's advanced OOP features, something which most common PHP frameworks hinder somewhat. I may put Neutrino up on GitHub as an open source project to be picked apart and improved when I'm happy with its initial performance and features.
I'm now using CSS 3 fonts to improve the look and accessibility of the site. Fonts used at the moment include Bebas Neue for the headings and the amazing Kingthings Chimaera for the title font (if you're not familiar with the Kingthings fonts I suggest you check them out; the range of fonts is impressive and each has been crafted with skill and attention). The site will look a lot better when these fonts are correctly rendered, so if you don't see them I suggest finding a more modern browser; it's worth doing. I used the wonderful @font-face Kit Generator from Font Squirrel to get these fonts into a usable form. If you haven't seen Font Squirrel before I recommend a visit: they've collected together fonts which are 100% free for commercial use, and their font tools are great.