Archive for November, 2009

EC2 Instance Billing

Sunday, November 29th, 2009

So, I’ve been racking up lots of EC2 instance charges lately. I thought it was because billing is per instance launch, but actually it’s just because I’ve been using a lot of time :-P

Tried an experiment just now: launched a machine, shut it down, launched again, shut it down.

Compared with csv usage report download, and noted:
- only billed for a single instance for the previous hour
- report timezone is UTC, ie London time, GMT, approximately
- report is updated at the end of each hour, ie just now was period midnight to 1am, and the usage data for that time period was appended to the downloaded report from 1am
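
To line up a usage report row with what I was actually doing, it helps to know which UTC hour has just been completed. A small sketch in plain shell arithmetic; the function name is made up for illustration, and it assumes only that the report works in whole UTC hours, as noted above:

```shell
#!/bin/bash
# Prints the most recently completed UTC hour as a window, eg "00:00-01:00 UTC".
# Takes an optional epoch timestamp in seconds; defaults to now.
last_completed_utc_hour() {
    local epoch=${1:-$(date +%s)}
    # Hour of day (UTC) that we are currently inside.
    local end_hour=$(( (epoch / 3600) % 24 ))
    # The hour before it, wrapping from 0 back to 23.
    local start_hour=$(( (end_hour + 23) % 24 ))
    printf '%02d:00-%02d:00 UTC\n' "$start_hour" "$end_hour"
}

# 01:30 UTC on 29 November 2009: prints 00:00-01:00 UTC
last_completed_utc_hour 1259458200
```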

gnu screen: detaching and attaching console sessions in ssh

Sunday, November 29th, 2009

It is possible to do something like the alt-1,2,3,4,5,6 terminals from within an ssh session. Not exactly the same thing as gnu screen, but it’s the easiest way I can think of to explain gnu screen. An excellent description of gnu screen is in the debian-administration article on gnu screen.

I feel it could be really useful in ssh sessions, eg when connecting to ec2 sessions, because disconnecting the ssh session leaves the screen session running, and – and this is not possible with nohup for example – it is possible to reconnect to it! The syntax is a little non-intuitive – think vi shortcuts – and ‘man screen’ is for me rather incomprehensible, but the debian-administration article referenced above is I feel *awesome*.
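
A minimal sketch of the detach / reattach cycle, assuming screen is installed on the remote machine. The function names and the SCREEN variable are made up for illustration; the flags themselves (-dmS, -ls, -r, -X quit) are standard screen options:

```shell
#!/bin/bash
# Hypothetical helpers wrapping standard screen flags.
# SCREEN can be overridden (eg SCREEN=echo) to preview the commands.
SCREEN=${SCREEN:-screen}

# Start a named session in the background, running the given command.
start_detached() {
    local name=$1; shift
    "$SCREEN" -dmS "$name" "$@"
}

# List the sessions that are still alive on this machine.
list_sessions() {
    "$SCREEN" -ls
}

# Reattach to a named session, eg after the ssh connection dropped.
reattach() {
    "$SCREEN" -r "$1"
}

# Tell a named session to quit.
kill_session() {
    "$SCREEN" -S "$1" -X quit
}
```

A typical cycle would be start_detached calc ./longjob.sh (the script name is a placeholder), log out, and later run reattach calc from a fresh ssh session. Inside an attached session, ctrl-a followed by d detaches again.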

US Census Bureau pop count just reached 6.8 billion

Saturday, November 28th, 2009

US Census Bureau

Tech support from friends: the road to Ubuntu?

Friday, November 27th, 2009

I used to be the guy people turned to when they had problems with Windows.

Actually: I still am, but I no longer know the answers to their questions, since I no longer have the same problems with them, so they’re stuck, and I’ve had three different people so far come up to me with Windows machines with broken wifis…

If I was using Windows, I’d have had the same problem most likely and would probably have found out how to fix it. Some tweak to the registry for TLS or WEP or something, and Ploof! It just works.

As an example of something I’ve fixed: in a hostel I was staying in recently, the wifi was very slow. Really slow. Insanely slow. I realized eventually – motivation is the mother of, well, looking into problems more deeply – that there was a machine on the network broadcasting ARP packets to everyone telling them that the router’s address was assigned to his machine’s mac address. He was routing the entire hostel’s traffic through his own, virus-laden machine. And it was virus-laden, because when I figured out which machine it was, the guy had no idea what was going on. Anyway, problem solved: just hard-code the router’s mac address on one’s own machines (a static ARP entry), or get the guy to clean his machine.
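
One rough way to spot this kind of thing on linux is to look for a mac address that is claimed by more than one IP address in the ARP table, since the offending machine shows up both under its own address and under the router’s. A sketch, assuming the /proc/net/arp column layout (IP address first, hardware address fourth); the function name is made up, and a duplicate can also be perfectly innocent, eg one machine holding several addresses:

```shell
#!/bin/bash
# Reads text in the layout of /proc/net/arp on stdin and prints every mac
# address that is claimed by more than one IP, followed by those IPs.
duplicate_macs() {
    awk 'NR > 1 && $4 != "00:00:00:00:00:00" {
             # Skip the header line and incomplete entries; group IPs by mac.
             ips[$4] = ips[$4] " " $1
             count[$4]++
         }
         END {
             for (mac in count)
                 if (count[mac] > 1) print mac ":" ips[mac]
         }'
}
```

On a live machine this would be run as duplicate_macs < /proc/net/arp. Pinning the router afterwards can be done with something like arp -s 192.168.1.1 00:11:22:33:44:55, where both addresses are placeholders for the real router values.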

For the actual drivers on Windows, I’m sure there are tiny tricks to fix things, but I no longer know what they are.

If they were running Ubuntu, I’d almost certainly know what was going on, or: at least care enough to google and fix it. And their machine would work.

So, I feel that as more techie people start using Ubuntu, or linux in general, there will be an increasing advantage for normal end-users to have their machine migrated to Ubuntu: not because it’s a better user experience for them directly, but because they can then count on free tech support from the nearest techie person :-P

PS and to be fair it’s one of the reasons I personally switched to Ubuntu, or at least an analogous one: trying to install and run and get support for opensource software on Windows is frustrating and tedious. In Ubuntu, opensource software ‘Just Works’, it’s awesome. And no licence agreements – or few – or click-through EULAs. sudo apt-get installs 97% of applications I feel, quickly, safely and easily.

ReactOS … and LUK … and and …

Friday, November 27th, 2009

It’s funny how my cool projects stand unknown by the sidelines for years, and then suddenly Ploof! they start to become visible.

It’s also interesting to me how many possible ways of doing something there are.

ReactOS. Just found out about it just now. It aims to completely re-write Windows. Not a compatibility layer on linux, a la Wine, but a complete rewrite from scratch. Sounds to me copyright-infringing somehow, note I’m not a lawyer, it’s just my off-the-top-of-my-head current opinion, please do not sue me over this or use it as evidence one way or the other :-P Actually, originally I wrote ‘patent-infringing’ in that previous sentence, but what has Windows actually done that is original? Windows was preceded by at least MacOS as far as I know, in my humble opinion. But copyright? I’m unsure… Anyway, it seems they have lawyers working on this stuff and that it *is* legal, so that’s very interesting. And cool.

Then, having just recovered from finding out that there is an opensource version of Windows heavily under development, I mean you can run it, it has a gui, even Paint works :-O, but no network for me, I found out about LUK.

LUK is the Linux Unified Kernel. Yet *another* way of tackling the problem. LUK sounds like wine: it is additional stuff on linux, but it implements the system calls at the kernel level, rather than in user-space, by pointing 0x80 calls to a table of linux calls, and 0x2e sys calls to a table of Windows calls.

What’s the difference between LUK and Wine? Unclear to me in practical terms. I guess: speed?

Wine works ok. Lots of things run in it. It’s not always very fast though. Starcraft on Wine is unplayable for me, even on a machine ten times more powerful than the machines I used to use ten years ago to play it.

A big proposed advantage of ReactOS is driver compatibility. Wine doesn’t really have anything to do with drivers I think, since it runs in userland, and drivers run in the kernel.

Unsure where LUK fits in in between these two extremes? Interestingly, it was initiated by a Chinese team Insigma Technology. I kind of think that is a good sign to the extent that Chinese technologies are I feel very pragmatic. There is little to no opensource culture here. If they’re working on this, I feel it is to make something that people in China will actually use. Perhaps they feel that, and I am guessing, purely speculating:
- LUK will be faster to develop than ReactOS, because it just wraps linux kernel functions for many things
- LUK is a more realistic substitute for Windows than a combination of linux and wine, because the drivers are readily available. I was going to say that it might need less wrestling with the system to get applications working than on wine, but I feel that’s nothing that systems administrators can’t do for end-users. Similarly I feel the end-user experience on linux is under-rated: my girlfriend has Ubuntu on her laptop (she has no choice :-P if she wants me to maintain it), and she has zero complaints about it. Really.

Ubuntu ‘Just Works’ :-O

Friday, November 27th, 2009

A flatmate came up to me the other day and asked me to fix his wifi. Well, I didn’t, just to get that out of the way, but: I looked at it, and thought ‘Windows. Ugh. I don’t remember how to use it any more, I don’t really want to remember, and it’s probably got a dozen viruses all fighting for ownership of the network stack…’.

I said, ‘Errr…. ok, I’ll take a look’. Turned the wifi on and off. No difference. Rebooted. No difference. Vaguely opened the network connections panel, and realized I had no real idea where to go next, other than Google and four hours of diagnosing someone else’s virus-ridden laptop, which is never one of my favorite occupations…

I said ‘Hmmm, maybe the wifi itself is broken, the hardware, or maybe the Windows software has an issue’, stating the obvious really, a la Donald Rumsfeld. The guy looked at me. I said ‘hmmm, give me a minute’. I was curious whether Ubuntu would work ok.

Grabbed my usb key containing Ubuntu Jaunty NBR installer. Plugged it in, rebooted the computer. Ploof! The wifi worked! Just like that. No tweaking necessary. Just clicked the wifi symbol and selected the network. Opened firefox. All worked ok.

Unfortunately for the flatmate, that’s about all I did. I mean, sure I could have let him play with it, but he would have started complaining that all his files were missing and stuff, and he’s Chinese, and our communications up to this point were mostly in hand signals, or in brief two-word utterances, so I just took the usb key out and shut the thing down.

But still: that the Ubuntu live usb ‘Just Works’ on a random Chinese cloned laptop is really cool I feel, and I’m looking forward to the day that my flat-mate begs me to install the ‘linux thing’ on his laptop :-P or at least copy it onto a USB key for him.

OpenID: secure logins = big win for me

Friday, November 27th, 2009

I’ve got OpenID activated for this blog, and I really like that when I put in my password, in McDonalds for example, over an open wifi connection, everything passes through encrypted, over SSL.

Sure, I could implement SSL myself on my websites, but it’s expensive, and *every* website I want to use needs it.

LinkedIn finally updated to allow SSL. Looking forward to them introducing OpenID at some point in the indefinite future.

1 TB of Memory in 1 Minute with 1 Command

Tuesday, November 24th, 2009

Really fun blog post:

1 TB of Memory in 1 Minute with 1 Command

It looks like there are instances available in EC2 with 68 gigabytes of memory. That’s a lot! And they’re about 2 dollars an hour, which is affordable, for playing around, though one wouldn’t want to just leave one running all the time without a good reason :-P

Eric Hammond ran 19 at once for fun, and points out that 19 x 68GB is about a terabyte!

Amazon EC2: ensuring instances shutdown when one is finished

Saturday, November 21st, 2009

I’ve been playing with Amazon EC2 recently. It’s pretty cool. It solves problems such as ‘I want to run Windows to test some stuff, what do I do?’ and ‘I need a calcserver to run whilst I’m working on my eeepc on the road, what do I do?’.

The cost per hour is pretty low, but if you leave an instance up for a few days or months, the charges will start to rack up.

If I was running off a desktop in my own house, I might create a screensaver that automatically shuts down the instances when it kicks in. Screensaver running: instances down. Screensaver not running: who knows?

But I’m not. I’m on an eeepc, in various Starbucks and so on, which is fairly comfortable, but raises the questions:
- what do I do if I start an instance and the network goes down? I can’t stop it any more!
- how to make sure that when I close the netbook – putting it to sleep straight away – the instances die?

My original solution was to make the instances themselves shut themselves down after an hour. It works, but:
- it means modifying the images, and storing them somewhere -> lots of work, you have to pay for storage space, and what if I forget, or there is a bug, or I don’t install it to the machine correctly?
- sometimes I don’t want them to shut down once an hour ;-)

My latest solution is to create a cron task on my web hosting, which shuts down my EC2 instances if I don’t have an ssh connection to the hosting.

Then, when I turn off my netbook / put it on stand-by, the ssh connection to my hosting will die, the cron job will kick in and the instances are guaranteed to die (guaranteed in the loosest sense of the term of course….). It even sends me a nice email about it!

How does this work? There are three parts:
- cron configuration itself
- script which detects whether I’m logged in on a bash session
- script which shuts down the instances, and emails me about it

Cron configuration. Type ‘crontab -e’, and add a line like:

33 * * * * /home/youruseraccount/ac/terminateinstancesifnoconnections.sh

This will run the script at 33 minutes past the hour, every hour, forever.

For testing, you can change the 33 to *, which will make it run once a minute. Be careful, you can flood your email system, and use up lots of webhosting cpu ;-)

Next, the script that cron runs is:

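#!/bin/bash

# Directory holding this script, so the helper script is found when run from cron.
scriptdir=$(dirname "$0")

# Count the processes whose ps line contains 'bash'.
# Writing the pattern as b[a]sh stops grep from matching its own command line.
psnum=$(ps -ef | grep 'b[a]sh' | wc -l)

# Exactly 2 means only this script and its $() subshell are running bash.
if [[ $psnum -ne 2 ]]; then
    # echo 'other connections present' >&2
    exit 0
fi

bash "$scriptdir/terminateinstances.sh"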

Make sure it is executable (chmod +x), and is found at the location you gave to crontab earlier.

What is grep 'b[a]sh'? This came up in an interview I went to last week, and I just figured out the answer whilst writing this script! Basically the pattern matches ‘bash’, but it doesn’t match the text ‘b[a]sh’ that appears in grep’s own command line in the ps output, ie it doesn’t match itself! Which is neat.

The script looks for other instances of bash. The $() is one instance. The script itself is another, when run from cron. So that would be 2. If there are not 2, then I’m already connected to bash on an ssh session most likely. If it’s another cron job, it’s not a biggie, it’ll kick in again an hour later, unless every cron job is configured for 33 minutes past the hour I guess :-P
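
The difference the brackets make can be seen without touching a real process list. A small sketch using made-up lines in place of the ps output; the helper names are invented for the demonstration:

```shell
#!/bin/bash
# Counts the lines on stdin that match the pattern given as $1.
count_matches() {
    grep -c -- "$1"
}

# Fake process list: two bash processes, ps itself, and the grep command
# as it would appear in ps when searching with the pattern in $1.
ps_with() {
    printf '%s\n' \
        '/bin/bash /home/user/check.sh' \
        '/bin/bash /home/user/check.sh' \
        'ps -ef' \
        "grep $1"
}

naive=$(ps_with 'bash' | count_matches 'bash')        # grep counts itself too
trick=$(ps_with 'b[a]sh' | count_matches 'b[a]sh')    # grep skips itself
echo "naive=$naive trick=$trick"    # prints naive=3 trick=2
```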

And finally, a script to terminate the instances and mail me:

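#!/bin/bash

# Load the EC2 environment (EC2_HOME, JAVA_HOME, etc), which cron does not set up.
. "$HOME/.bash_profile"

# Ids of all instances whose status is not 'terminated'.
instances=$(ec2-describe-instances | grep INSTANCE | awk '{print $2,$4}' | grep -v terminated | awk '{print $1}')
for instance in $instances; do
    # This message ends up in the email that cron sends.
    echo "terminating instance $instance" >&2
    ec2-terminate-instances "$instance"
done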

This assumes that $HOME/.bash_profile contains the environment setup for amazon EC2, ie EC2_HOME, JAVA_HOME, etc

What the script does is use ec2-describe-instances to get a list of instances, grep for all that don’t have a status of ‘terminated’, and then run a for loop calling ec2-terminate-instances for each one.
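
The filtering step can be tried out on its own, without the EC2 tools installed. This sketch is a variant of the pipeline rather than the exact one in the script: it uses a single awk call and looks for ‘terminated’ anywhere on the line, not only in the fourth field. The function name and the sample lines are made up, and the sample only roughly follows the shape of real ec2-describe-instances output:

```shell
#!/bin/bash
# Prints the instance id (second field) of every INSTANCE line on stdin
# that does not mention 'terminated'. Other lines, eg RESERVATION, are skipped.
live_instance_ids() {
    awk '$1 == "INSTANCE" && !/terminated/ { print $2 }'
}

# Made-up sample input.
sample='RESERVATION r-11111111 123456789012 default
INSTANCE i-aaaa1111 ami-12345678 ec2-1-2-3-4.compute-1.amazonaws.com ip-10-0-0-1.ec2.internal running mykey 0 m1.small
RESERVATION r-22222222 123456789012 default
INSTANCE i-bbbb2222 ami-12345678 terminated mykey 0 m1.small'

printf '%s\n' "$sample" | live_instance_ids    # prints i-aaaa1111
```

Against the real tools this would be used as ec2-describe-instances | live_instance_ids.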

What about the email? Well, when cron runs a script, anything the script writes to stdout or stderr, eg via ‘>&2’, gets collected into an email, and sent to you! If there is no output, then no email; otherwise it all gets concatenated together, and emailed to you at the end of the job.

So we get an email if and only if an instance is detected as not terminated and is then terminated.

The crontab line makes it run once an hour, so if the ssh connection to the webhosting dies occasionally, it’s not a biggy, it won’t instantly kill the instances, but it makes sure they’re not going to run all night, or month, or decade… :-P

Purging spam from old mediawiki installations

Saturday, November 21st, 2009

Leave a mediawiki installation unattended for a few years, and it will be devastated by spam when you come back…

Undoing the changes by hand sounds a pain, so what I did was:

1. get rid of the new spam users by doing:

delete from user where user_id > xxx;

… where xxx is some number I determined after which there were no legitimate users

2. delete everything done by those users:

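delete from revision where not exists ( select * from user where user.user_id = revision.rev_user);

Note that this also deletes anonymous edits, since those are stored with rev_user = 0, which matches no row in the user table.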

One could also consider doing something like (but not exactly this, it is too broad):

delete from page where not exists (select * from revision where revision.rev_page = page.page_id);
delete from text where not exists (select * from revision where revision.rev_text_id = text.old_id);

Note that on older versions of mediawiki, the page table is called ‘cur’.

You might also want to do some purging of the externallinks table.
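
Steps 1 and 2 can be wrapped so that the cutoff is checked before any SQL is produced. A sketch only: the function name is made up, it assumes the revision author is stored in the rev_user column and that the tables have no prefix, and it prints the statements instead of running them:

```shell
#!/bin/bash
# Prints (does not run) the SQL for steps 1 and 2, given the highest
# legitimate user_id. Refuses anything that is not a plain number.
spam_purge_sql() {
    local cutoff=$1
    case $cutoff in
        ''|*[!0-9]*)
            echo "usage: spam_purge_sql <highest legitimate user_id>" >&2
            return 1 ;;
    esac
    cat <<EOF
delete from user where user_id > $cutoff;
delete from revision where not exists (select * from user where user.user_id = revision.rev_user);
EOF
}
```

After reading the output, it could be piped into the mysql client against a test copy, eg spam_purge_sql 1234 | mysql wikidb, where 1234 and wikidb are placeholders.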

3. To update the versions of the page you see when viewing, I did:

create table latestbypage as select page_id, max(rev_id) as latest from page, revision where page.page_id = revision.rev_page group by page_id;
update page, latestbypage set page.page_latest = latestbypage.latest where page.page_id = latestbypage.page_id;
drop table latestbypage;

This didn’t work perfectly, so there is something missing from the procedure here, but I’m not sure what. The main page didn’t show the latest version, or any version: I had to click on ‘history’, click the latest version, then ‘save’. The other pages were ok.

4. purge index

delete from searchindex;
You can recreate it by running php maintenance/rebuildall.php

The schema is described at:

Since it was an old mediawiki installation, I had to make a choice between upgrading first then purging data, or purging the massive amount of spam first then upgrading? I chose to purge first to make the upgrade shorter, but arguably this means the schema I was working off didn’t quite match that described above, and maybe explains some of the issues I saw with latest versions not showing correctly.

I strongly recommend making a backup of the database first, since I needed it several times!
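
A minimal sketch of such a backup with mysqldump, assuming the database credentials come from ~/.my.cnf. The function names and the database name used in the usage note are made up:

```shell
#!/bin/bash
# Builds a timestamped dump filename for a database name,
# in the form <name>-YYYYMMDD-HHMMSS.sql
backup_filename() {
    printf '%s-%s.sql\n' "$1" "$(date +%Y%m%d-%H%M%S)"
}

# Dumps the named database to a timestamped file in the current directory
# and prints the filename on success.
backup_database() {
    local db=$1 file
    file=$(backup_filename "$db") || return 1
    mysqldump "$db" > "$file" && echo "$file"
}
```

Running backup_database wikidb before each destructive step leaves a separate dump to fall back on for every stage of the purge.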

USE THIS INFORMATION AT YOUR OWN RISK. IT IS FAR FROM PERFECT AND YOU REALLY NEED TO DO A BACKUP AND YOUR OWN TESTING ON A BACKUP/TEST DATABASE FIRST!

Oh, one other thing: to reset your admin password, do something like:

update user set user_password=md5(concat('1-',md5('mynewadminpass'))) where user_id = 1;