Nov 15, 2008

Once it's out, it's out

Have you ever said something you wanted to take back the moment you finished the sentence? Maybe you got lucky and there were only a few people around. But once you put something on the web, it's there forever. The Internet has no concept of a delete button.

There are always the omnipresent caches and archives, so even deleting content from your site doesn't help. This happened recently when Apple pulled the biography of their new executive Mark Papermaster from their website, after a court barred him from reporting to work at Apple until his lawsuit with IBM is resolved. I will not go into details (you can read the Ars Technica coverage of the issue) because my point lies elsewhere. You can say what you want, but if it is connected to the Internet, it is public. Full stop.

The Internet is full of stories of people who tried to hide their humiliations and mistakes from the public with injunctions, lawsuits and whatnot. The end result is almost always the Streisand effect. If you read the wiki, there are some nice examples of why you should keep your private things private. Once it's out, trying to censor it will only make things worse (and the more famous or attractive you are, the worse it gets). It might be a good time to read some guides to privacy right now. I know you are not going to do that anyway, but it is still my dream that one day a new generation will be able to protect their privacy online. Unfortunately, anecdotal evidence suggests otherwise.

By the way, does anyone know of a simple list of things to do to improve your privacy online?

Nov 14, 2008

Earn money sending spam!

Seriously. According to a joint study by security researchers, the Storm botnet can generate as much as $3.5M in revenue per year. It was definitely one of the most ingenious pieces of research and analysis I have read so far.

In order to measure the effectiveness of spam campaigns, the researchers joined the Storm botnet with bots that were used to conduct a MITM attack on Storm itself. These bots changed the spam campaigns slightly and redirected the targets of the campaigns (users) to servers controlled by the researchers. These servers mimicked the spammers' websites and counted the number of visitors and the number of actual victims who fell for the scams and provided their information (credit card number, social security number, etc.). If the results are correct, spam campaigns convert in less than 0.00001% of cases. That rate is indeed extremely low, roughly one sale per ten million messages, but if you consider the size of Storm and the number of emails it sends every day, you get to more interesting numbers, ranging from $7,000 to $9,500 of revenue per DAY.

I left out a few interesting details, so if you have some time, consider reading the whole paper (12 pages).


Xorg evdev madness

It is really astonishing how easy it is to find topics for blogging when one looks around :)

I recently upgraded my Xorg installation to the latest ~x86 version. For Gentoo virgins: this means an unstable version, although it is usually considered stable upstream; it's just that integration with other apps can sometimes be problematic. The stable version was really old and had problems with recent kernel versions. I was very happy with the upgrade, which made my 5-year-old Thinkpad more alive than ever. I also decided to recreate my xorg.conf, because most of the stuff in it was not needed anyway now that XRandR 1.2 is used.

What is my problem then? Well, after the upgrade some features of my touchpad stopped working (most notably circular scrolling) and I could not switch between different keyboard layouts. The first thing I did was, of course, look at Xorg.0.log. The important part follows:

(II) XINPUT: Adding extended input device "AT Translated Set 2 keyboard" (type: KEYBOARD)
(**) Option "xkb_rules" "base"
(**) AT Translated Set 2 keyboard: xkb_rules: "base"
(**) Option "xkb_model" "evdev"
(**) AT Translated Set 2 keyboard: xkb_model: "evdev"
(**) Option "xkb_layout" "us"
(**) AT Translated Set 2 keyboard: xkb_layout: "us"
(II) config/hal: Adding input device ThinkPad Extra Buttons
(**) ThinkPad Extra Buttons: always reports core events
(**) ThinkPad Extra Buttons: Device: "/dev/input/event3"
(II) ThinkPad Extra Buttons: Found keys
(II) ThinkPad Extra Buttons: Configuring as keyboard
(II) XINPUT: Adding extended input device "ThinkPad Extra Buttons" (type: KEYBOARD)
(**) Option "xkb_rules" "base"
(**) ThinkPad Extra Buttons: xkb_rules: "base"
(**) Option "xkb_model" "evdev"
(**) ThinkPad Extra Buttons: xkb_model: "evdev"
(**) Option "xkb_layout" "us"
(**) ThinkPad Extra Buttons: xkb_layout: "us"

As it turned out, evdev found additional "keyboards" and IGNORED my layout settings for the real keyboard. I found a few forum posts dealing with the same problem on Gentoo and Arch Linux. I will not go into details; if you really want to know all the crazy solutions people found, read the forums. The easiest solution? Uninstall the evdev driver for now if you don't need it (you probably don't). A similar effect could probably be achieved by adding Option "AutoAddDevices" "false" to the ServerFlags section of xorg.conf, however I didn't try this approach.
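For reference, the relevant xorg.conf bit would look roughly like this (untried on my side, so treat it as a sketch rather than a tested fix):

Section "ServerFlags"
    # don't let the server hotplug input devices via HAL,
    # so the keyboard/mouse sections from xorg.conf are used instead of evdev
    Option "AutoAddDevices" "false"
EndSection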


World is spinning too fast

And while it's spinning faster every day (perhaps because of her?), my blog topics are getting cold and old. I wanted to write about many things, but instead I was living my life. Go figure... So first let me just post a simple summary of links I found worth reading in the past weeks:
There were also a few others, but just like 2001: A Search Odyssey they became outdated some time ago.

There is, however, one article that sparked my interest more than the others in the past weeks. The title of the article is "Tips for getting started in information security". Why was this interesting to me? I have quite a few feeds in my RSS reader. Some of them deal with security, some with more general IT topics, some are just plain fun. My problem is that I like security as much as I like software development. It is, however, not that easy to find basic-level material dealing with application security. When I read about the attack on the Adobe Flash virtual machine, my head started spinning. I know a thing or two about the stack, buffer overflows etc., but this is just too much for me right now. So I decided I have to change my approach a bit and start catching up on application security. Otherwise I will just turn into one of those old-school wannabes who know something about everything but not really everything about something.

Unfortunately I don't expect to have much time for blogging in the upcoming days, but we'll see.

Oct 25, 2008

GStreamer bug hunting

While I was working on GSTFS (from my previous post) I also managed to stumble on a bug in GStreamer. Of course, at the time I didn't know it was a bug, since I was a GStreamer greenhorn (I actually still am :) ).

What was it about? Well, when I was converting mp3 files with the pipeline
[source] ! decodebin ! audioconvert ! lame ! id3v2mux ! [output]
the ID3 tags were lost in the conversion. This was a real bummer for me, since my player fully supports id3v2.4. When I had read through the docs and man pages and could not find any solution, I did what every self-respecting geek would do: I fired up my IRC client and joined #gstreamer on freenode in search of help. After a few suggestions that didn't work, we closed the issue with me filing a bug with an unknown cause. Originally the bug seemed to be part of a demuxing problem with mp3s (ogg files didn't have this problem).
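Spelled out as a gst-launch command, the pipeline looks roughly like this (file names and bitrate are just illustrative placeholders):

$ gst-launch-0.10 filesrc location=input.mp3 ! decodebin ! audioconvert \
    ! lame bitrate=160 ! id3v2mux ! filesink location=output.mp3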

After some time, I finally had a revelation. What made my mp3 files special? Nothing. Except that all of them had ReplayGain tags calculated with mp3gain. I removed the ReplayGain tags, tried the same GStreamer pipeline as before and... voilà! My converted files were no longer missing their ID3 tags. The bug filed in Bugzilla is not closed yet, but now the developers at least know where to look.

Now I have a request for all 2 people reading my blog (including me :) ). If you find a bug in open source (or free, whatever you want to call it) software, pleeeeeaase report it. Ideally, first make damn sure you are not filing a PEBKAC, so that developers don't waste their time. If you can, help with testing the fix. If you are not skilled enough, you can always go to the IRC channel or send an email and ask whether the documentation needs clarification or other relatively easy chores need to be done. There's always stuff to do. Just read the post 5 Ways to Contribute to Open Source Projects Without Coding.

Oct 24, 2008

Opensource participation

In my previous post I mentioned a project to transparently convert media files uploaded to my mp3 player. At first I wanted to create my own project from scratch, but then I searched around the net and the FUSE wiki and found the GStreamer filesystem, or GSTFS for short.

GSTFS is built on the GStreamer multimedia framework to handle media conversions and on FUSE to create a virtual filesystem. Thanks to GStreamer, simple changes on the command line let you handle almost any task concerning media files: conversion of music files and videos, resizing of pictures and more.

I started playing with GSTFS, trying to convert my music collection to lower-bitrate mp3s. A simple 'cp -R music/ music_converted/' should have worked. But it didn't. Why? Well, GSTFS shows not-yet-converted files as 0-sized, and cp tries to optimize copying by not actually copying empty files; it doesn't even try to read them. That meant running cp twice, since the second run would see the actual sizes. Even then there is a problem with the expiration of the file cache, so if your music collection is more than a few files, you are out of luck.
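To illustrate the workaround (assuming the GSTFS mount is at music/ and that it transcodes a file the first time it is read), you can force a read of everything first and only then copy; this still falls apart once the conversion cache starts expiring:

$ find music/ -type f -print0 | xargs -0 cat > /dev/null   # force gstfs to transcode every file once
$ cp -R music/ music_converted/                            # the files now report their real sizes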

And here we come to the great advantage of open source (at least for me). The source code of GSTFS is available, so I fixed this small bug and sent a few-line patch to the original author, Bob Copeland. I also asked if he could perhaps create a public repository of the sources on repo.or.cz. Interestingly enough, I was apparently not the only one to ask for it. And so, lo and behold, the Git repository of GSTFS is online. You can now find my work on improving GSTFS in the mob branch of the repository. Hopefully I will be able to contribute more to this great idea and my code will actually make it into the main branch :)

Oct 16, 2008

Sound quality is relative

We all know that, right? Right. Who am I kidding? Most people don't notice the difference between a 96 kbps, four-times re-encoded mp3 and FLAC. Of course, it depends on your setup. Headphones, mp3 players. All of it makes your experience better (or worse).

Since I bought my first Koss Porta Pro headphones I have realized that there are headphones and HEADPHONES. With some you don't even hear the important parts, while others make you realize how much noise there is in the source :). And so my music collection is now mostly FLAC, high-quality ogg or VBR mp3. Normally I don't care about the size of music files; storage is not that expensive these days. But I have an old Cowon U3 music player (I still cannot find anything better) with only 2GB of memory, and that's where size comes into play. What I usually do is convert music to lower bitrates (usually 160 kbps VBR) before transferring it to the player. But I don't keep those converted versions around, since I don't have that much space lying around :). So I waste time choosing files to transfer, then converting them and finally copying them to the player.
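For the curious, the manual conversion step is usually a one-liner along these lines (a rough example; the exact VBR level is a matter of taste, and -V 4 lands somewhere around 160 kbps):

$ flac -dc song.flac | lame -V 4 - song.mp3   # decode FLAC to stdout and re-encode as VBR mp3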

Manually converting and then transferring files is kind of a bummer though. Then I realized... I can actually program, right?! So how about making a FUSE virtual filesystem on top of the VFAT filesystem on the player? This virtual filesystem would convert music files being copied onto it to a specified format in the background. Processor speeds are fast enough these days to do this more or less in real time, so why not?

How will this affect my workflow? Compare:

Now:
  • Copy files to a temporary directory
  • Convert big files to lower bitrates
  • Copy files to the mp3 player

Virtual FS:
  • Copy files to the mp3 player
So far this is just an idea. I don't know of any other project doing the same thing (there are a few dealing with general data compression, but none specific to media files).

Expect more to come (just don't expect deadlines :) ).

*EDIT* As it happens, there are already FUSE projects that do exactly what I had in mind. I guess I should check that page more often. The projects are GSTFS and MP3FS, of which the first seems more promising and flexible.

Oct 11, 2008

We need CAPTHHA

I am pretty sure everyone has seen a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) before. Maybe you didn't know the (full) name, but you have encountered them when registering accounts, posting comments or accessing some parts of the web. You know, those annoying things that exercise your ability to recognize distorted words on weird backgrounds.

CAPTCHAs are used to protect against automated attacks. For example, automatic registration of new users on Gmail would create great opportunities for spammers. CAPTCHAs mostly work, even if they get "hacked" from time to time. The biggest problem? They are reaching levels where even humans have trouble reading the letters. I still have nightmares when I remember the CAPTCHAs used on RapidShare; telling cats from dogs was somehow not that easy for me. I am not sure about the "hackability" of reCAPTCHA, but as far as usability goes, it's one of the best for me. Too bad only a few sites are using it.

The main problem with CAPTCHAs is not their complexity but relay attacks and human solvers from third-world countries paid to solve thousands of CAPTCHAs a day. What we really need is a CAPTHHA (Completely Automated Public Test to tell Humans and Humans Apart). Computer science is far from being able to tell humans with "clean" intentions from those being paid to get past the defences. One solution would be to issue certificates of "humanity" signed by a central authority; you could then ban users who misuse their certificates. There are of course privacy and security problems with this approach, not to mention financial "issues", so I guess this is not how it's going to work. Other approaches have also been tried, but they usually create problems for disabled people. I am certainly interested in how computer science will solve this problem.

Oct 3, 2008

GFuture

I'm slowly starting to feel like a Google Fanboy(tm), but Big G has made an interesting announcement recently. Dubbed "Clean Energy 2030", the proposal encourages several ways to achieve the use of "clean" energy by the year 2030. I suggest you read it, especially if you like sci-fi. Basically they suggest three complementary things:
  1. Reduce demand by doing more with less - in other words energy efficiency.
  2. Develop renewable energy that is cheaper than coal (RE<c) - concentrate on solar, wind and geothermal energy.
  3. Electrify transportation and re-invent our electric grid.
Of these, the first two seem OK. But electrifying transportation? Especially in the US, where you cannot buy a car with an engine smaller than 2000cc? I will watch closely. I still remember those sci-fi movies that showed flying cars in the year 2000, and I am still disappointed there are almost none.

I would love to see the future come sooner, preferably while I'm still alive, but I am a little bit sceptical. Google might chip in with a generous $45 million this year, but will governments follow? I doubt it. Still, hope dies last. I still have this dream of Earthlings being one big nation where it doesn't really matter which part of Earth you are from; it just matters that you are not from Qo'noS or Minbar. And this "cheap energy for everyone" initiative reminds me of those dreams. Oh well, one can dream.

Sep 24, 2008

Dropbox

I had this one sitting in my "almost-finished-near-ready-to-publish" folder for some time already. The past week was again a little crazy in my personal life, so there was no real time to finish this small piece until now... :)

Do you frequently switch between two computers that are not connected by a local network? If so, I guess you have wanted to share data between them at least once. It used to be a hassle; now it's easy. Dropbox started public open beta-testing of their service a few weeks ago. If you haven't heard of Dropbox, here is my little intro. Dropbox is essentially centralized version tracking accessible from anywhere, without the need to configure anything. You copy files you want to share with other machines into your Dropbox directory and they are automatically uploaded to the Dropbox server. If another machine on the other end of the world is running with the same Dropbox account, it is automatically synced. If that sounds confusing, I encourage you to read the introductory tour on their website. A free account gives you 2GB of storage and unlimited bandwidth, so it's not that bad. Most of all, it "just works(tm)". You can later upgrade to the Pro version with 50GB of space for $9.99/month or $99.99/year. I am not sure about availability outside the US, but I guess that's not going to be a problem.

You can synchronize files between Windows, MacOS X and Linux machines. There are still a few rough edges, but I guess that's why it's a beta :). It would be really nice if the protocol for communicating with the Dropbox server were made public, but I guess I am asking for too much. At least the Nautilus interface on Linux is GPLed and there are already alternative "clients" for retrieving the status of your Dropbox account.

The good thing is that you can also share files with the rest of the world, just like you would with, for example, a Rapidshare account. The difference? No limits on file sizes (so far, as far as I know). I just wonder how they will fight the sharing of illegal data.

With services like this, privacy is always a concern. You give up a certain amount of privacy by uploading your files to a third-party server. So whatever you do, be sure to encrypt your private files. Happy sharing.

Sep 16, 2008

Stackoverflow launched

If someone has actually read my previous posts (heh), (s)he may have noticed that I quite often link to www.codinghorror.com. It is Jeff Atwood's blog, and I usually find it very exciting to read. His style of writing and his ability to convey complex messages in a simple way are my holy grail. And when he is not able to do it himself, he links to other authors A LOT. Instead of repeating the same thing that has been said over and over again, he just links to the proper post made by some other fellow programmer/software engineer. Avoiding duplication is really one of the basic goals of programming: instead of repeating the same code 20 times, just write a function and call it 20 times.

But that is just the low-level stuff. Jeff, Joel Spolsky and a few others embarked on an adventure to get rid of duplication in the minds of programmers. Programming is so inherently complex that no one really knows the solution to every problem, and don't get me started on optimal solutions. What did you do when you found a solution to some programming challenge, or some tricky workaround for a problem that had been bugging you for weeks? If you have a blog, you could go and post your solution there. Maybe someone would notice. Maybe not. So what did Jeff & Co. do? They created and launched www.stackoverflow.com. Quoting from the about page:
Stack Overflow is a programming Q & A site that's free.
As is often the case, powerful ideas come in simple packages :-). It is that simple. You ask, others answer. Then you vote and the best answer wins. It's kind of like Experts Exchange, just without the paying part, and with better user participation. Users who have proven themselves worthy earn karma points and thus become more or less moderators. Try it out, and you will see what I mean. Begin by reading their FAQ.

Google released its own web browser called Chrome a few weeks ago, and the whole web has been buzzing with excitement ever since. They did it Google style: everything is neat, clean and simple, and quite a few features are also unique. Google engineers obviously put a lot of thought into scratching their itches with web applications. The JavaScript engine is fast and the whole browser is built around the idea that the web is just a place for applications. One of the most touted things about Chrome was its security features. You can read a full account of the basic Chrome features on its project page.

In Chrome, each tab runs as a separate process communicating with the main window through standard IPC. This means that if there is a fatal error in the handling of some page (malicious or otherwise), other tabs should be unaffected and your half-written witty response to that jerk on the forum will not be lost. Chrome also has other security enhancements that should make it more secure. I said should. Within a few days of the Chrome release, several security vulnerabilities surfaced, ranging from a merely annoying DoS to plain dangerous remote code execution.

What caught my attention was a bug that enabled downloading files to the user's desktop without user confirmation. It was caused by Googlers using an older version of the WebKit open source rendering engine in Chrome. Integrating "foreign" software with your application can be tricky, especially if you have to ensure that everything will keep working smoothly after an upgrade. In that respect, it is sometimes OK to use older versions of libraries, as long as you at least fix the security bugs. People write buggy software, Google engineers included. I am just surprised that they don't have a process that would prevent distributing software with known security vulnerabilities to the public.

And that is the main problem. Chrome is beta software, so bugs are to be expected. But Google went public with Chrome in the worst possible way. They put a link to the Chrome download page on their home page, making hundreds of thousands of people their beta testers, people who have no idea what "beta testing" actually means. They just know that Google has some cool new stuff. So let's try it, right? Wrong. Most of us expect our browser to be safe for e-banking, porn and kids (not necessarily in that order). Unfortunately, Chrome is not that kind of browser. Yet. I am pretty sure it is going to be a great browser in the future, but right now Google should put a big red sign saying "DANGEROUS" in the middle of the Chrome download page.

Until Chrome becomes polished enough for Google to stop calling it "beta", it has no place on the desktops of ordinary computer users. Even oh-so-evil Microsoft doesn't show a download link for the IE8 beta on their main page to promote it. The mentioned issues aside, Chrome really does sport a few good ideas that other browsers could use as well. Try it out, and you will like it. Then go back to your old browser for the time being.

Sep 11, 2008

Google copying ideas?

Google's Marissa Mayer (head of the Search Products & User Experience department) wrote a blog post today about the current limitations of search and possible future improvements. All in all a very interesting article, in which she compares current search to the biology of the 16th-17th century.
[search is] a new science where we make big and exciting breakthroughs all the time. However, it could be a hundred years or more before we have microscopes and an understanding of the proverbial molecules and atoms of search. Just like biology and physics several hundred years ago, the biggest advances are yet to come.
I can only concur. Search is relatively easy for tech-savvy people, but the average mother of three will have problems formulating her search queries and picking the right keywords for the job. There is still a lot of work ahead of Google and its search boffins.

What made me write this article though was this excerpt:
Our presentation is still very linear (the results are just a list) and even (no one result is more important or larger than the next). What if the results page began to transform radically to really harness these different types of results into something that felt much more like an answer rather than just 10 independent guesses? What if results pages pulled the best media together and laid it out such that the most useful content was not only first but largest? What if we laid out content in columns to use more of the width available on newer, wider screens?
Does it remind you of anything? It does to me. A few weeks ago a new player appeared in the search engine wars. Its name is Cuil, and it does exactly the things Mayer is thinking about changing: multiple-column results, (mostly) relevant media added to search results and a completely different layout. Google has a lot of smart people, so I would not be surprised if they had been working on completely revamping the Google homepage for some time. But the timing of these ideas is not very convincing to me. In the end it's the end user who wins, because we should not care about the search engine, only the results.

Sep 10, 2008

Stumbleupon password policy

I already wrote one post about passwords a few weeks ago. As much as we would like them to, passwords are not going away in the foreseeable future. But it seems I have found something worth mentioning again :)

Recently I started using StumbleUpon. For those who don't know the site, here is a short description from their main page:
StumbleUpon discovers web sites based on your interests. Whether it's a web page, photo or video, our personalized recommendation engine learns what you like, and brings you more.
It's basically a social networking site for rating and exchanging links. It's a nice way to discover as-yet-unknown gems of the Interweb. Just stumble around :)

Here's what sparked my interest. After registering with the site I received the following email:

StumbleUpon

Discover new web sites

Hi xxx,
Thanks for joining StumbleUpon! Please click here
to verify your email address:


http://www.stumbleupon.com/verifyuser.php?email=xxx%40gmail.com&verification=d6z505kjmtjox3


Here are your login details, save this information and
store it securely:


Email: xxx@gmail.com

Password: MY PASSWORD IN CLEARTEXT

...
...

What the hell are they thinking? Sending a cleartext password through email has not been acceptable for quite a few years now, especially for large public websites. There are other options for when users forget their password, for example:
  • resetting the password to a random one that is usable only once (see the one-liner below),
  • using security questions, e.g. "What was the name of your first pet?". They are not very secure, but still better than cleartext passwords,
  • lots of other options (a Google-training exercise for the reader :) )
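Just to illustrate the first option: generating a throwaway one-time password on the server side is basically a one-liner (expiring it after first use is of course up to the site; this assumes openssl is available):

$ openssl rand -base64 12   # 12 random bytes, base64-encoded, handed out as a one-time password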
Maybe they count on StumbleUpon being a low-risk site, where losing an account is not dangerous to your online identity. But they have obviously forgotten that most users reuse the same password over and over again. So their password for StumbleUpon will be the same as for their Gmail account, and that will be the same as for umpteen other services. I am only fortunate that I stopped recycling passwords a long time ago. Shame on you, StumbleUpon!


End of the world is not here yet

Hooray! The world didn't end today. If you've been living under a rock (or you are just not interested in these things :) ) then you may have missed that this morning the LHC started working. The goal of the whole project is to create a really small Big Bang.

You can easily guess why some people consider these experiments dangerous. The general consensus among scientists is that it's safe, but not everyone is sure. It's almost like the first tests of nuclear weapons. Edward Teller, the Hungarian scientist, was concerned that a nuclear test in the atmosphere could ignite it and burn everything (I mean EVERYTHING). The speculation was later refuted by a more or less mathematical proof that it's not possible. I would say that in the end the LHC can be as important for the advancement of the human race as the Manhattan Project was. Yes, I know they created the atomic bomb, but by doing so they started a revolution in nuclear energy and certainly in other research areas that were not possible before.

Anyway, read at least the Wikipedia article about the LHC. It's really worth it. Or even better, I think the BBC had an LHC documentary; go watch it.

Note: Hopefully I will finally have time to write some more posts today. I was too busy living my life for the past week :)


Aug 29, 2008

Keepin' it short

I have been wondering how I should write this blog. For example, the previous post about the false sense of security was originally supposed to be a longer rant. In the end I shortened it considerably and focused on Perspectives. Why?

Well, the main reason is that I realized there are already a lot of essays and rants about the perception of security. Maybe I could write a really good one if I took some time to think it through. But if you are interested in privacy or security, then you already know all these things. And if you are just another ordinary computer user who does not care whether his email account gets cracked, my blog will not change that.

Actually, Jeff Atwood wrote some time ago about the unreachable type of software engineer (or, for that matter, any professional). The paragraph that completely describes my feelings is this one:
The problem isn't the other 80%. The problem is that we're stuck inside our own insular little 20% world, and we forget that there's a very large group of programmers we have almost no influence over. Very little we do will make any difference outside our relatively small group. The problem, as I obviously failed to make clear in the post, is figuring out how to reach the unreachable. That's how you make lasting and permanent changes in the craft of software development. Not by catering to the elite-- these people take care of themselves-- but by reaching out to the majority of everyday programmers.

Writing for the masses is not easy, and I don't think I'm up to it. Yet. It makes me angry that not everyone loves his job or profession, but I cannot change that. So I will keep writing about my passions the way I see fit, and hopefully one day I will become a good enough software engineer and writer in one person to be able to influence the 80%.

Note: If you are wondering what's up with the 80% - 20% thing, I recommend the article by Ben Collins-Sussman about the two types of programmers.

Aug 28, 2008

Lack of security is not a problem

A false sense of security is. As Dan Kaminsky pointed out recently, there have been numerous BIG security problems with fundamental Internet services. All of them undermine the basic principles on which the Internet is built: routing and DNS.

Can we trust the other side? How do we know that we are "talking" to the same computer as a few days ago? This question is usually answered by encrypting the communication and authenticating through SSL (https). Many websites use self-signed certificates, but these provide only encryption, not authentication. There are quite a few good examples of the security pitfalls of self-signed certificates.

Recently I also managed to stumble on a nice Firefox extension called Perspectives. Usually only your browser checks the security certificate of the https server you are connecting to. Perspectives additionally asks a set of independent network "notaries" to fetch the certificate from their own vantage points and compares what they see with what you see. So if an attacker takes over the path between you and the destination server, trying to execute a MITM attack, Perspectives will detect it and warn you. It will even warn you if the certificate changed recently. This makes even self-signed certificates somewhat more secure. Without Perspectives you could easily be lured into a den of wolves. For a more in-depth explanation of how Perspectives works, see the original publication.

The basic principle still stands: you are most vulnerable when you don't expect an attack. In other words:
A little paranoia never hurts
So the next time you see a warning about an invalid, outdated or self-signed certificate, don't accept it without thinking about the consequences.

Aug 26, 2008

Strong passwords suck, but they don't have to

Amrit Williams wrote a nice piece on why passwords suck. But as Martin McKeay pointed out, Amrit didn't provide any real solutions, except maybe using passphrases. Passwords are the gateway to most people's online existence. Most people know that there are certain rules for creating strong passwords (at least I hope so), but only a handful of people use really secure ones. Moreover, you should have a different password for every program/email account/social networking site/etc. Why? So that when one account becomes compromised (by whatever means), the others stay safe.

You can find a lot of rules for choosing good passwords all around the Internet. There is only one problem with them: if we really wanted to follow all the rules, most of us would end up with 20+ passwords, each longer than 8 characters and most of them without any meaning. Good luck remembering those. But hey! We are in the computer age, we don't have to remember stuff anymore, right? Why not use a decent password manager? Then you only have to remember one password (but it had better be REALLY secure).
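Inventing those 20+ meaningless passwords doesn't have to be manual work either; any Unix-like box can generate them for you. A quick sketch (length and character set are up to you):

$ tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 16; echo   # 16 random alphanumeric characters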

This approach creates one more problem for us though: the mobility of our passwords. You want to access website x.y.com? I hope you have your password manager and its database at hand. Otherwise you're screwed. I see two solutions:
  • If you use some kind of UNIX-like system and you have a public IP, you can use a command-line password manager to access your passwords from anywhere.
  • Carry your password manager and its database around with you.
I like the second method more, because you don't have to worry about firewalls, proxies and similar stuff.

Recently I found out about PortableApps. It's a set of open source applications designed to be run from a USB thumb drive without leaving anything behind after you close them. No registry changes, no temporary files, etc. One of the applications offered is KeePass Password Safe, which uses AES to securely encrypt a database of passwords. This Windows-only set of applications gives you the means to have strong, unique passwords that you can carry around with you. So what are you waiting for? Make them unique!

Note: I tend to use the gpass password manager (Unix-only, but I usually have access to my machine) and I remember the most important passwords by heart. I'll probably migrate to some other multiplatform solution soon (maybe PasswordSafe?).

Note2: Apparently there is similar (or even better) software for MacOS X (1Password), though I haven't tried it.

Aug 23, 2008

Developer isolation

I recently stumbled upon a blog post about TraceMonkey (thanks to Sisken). TraceMonkey is the codename for new improvements to SpiderMonkey (Firefox's JavaScript engine). The results are very impressive, with speedups ranging from 2x to more than 20x. I love Firefox and I'm looking forward to every new version bringing more exciting features. But what struck me most in the post was this statement:
I fully expect to see more, massive, projects being written in JavaScript. Projects that expect the performance gains that we're starting to see. Applications that are number-heavy (like image manipulation) or object-heavy (like relational object structures).
Now don't get me wrong. I get excited about new features just as much as every other geek :). I see a problem here though. Firefox is biting off a bigger piece of the market-share pie every month, but however we put it, it's still at most around 30% in some parts of Europe (the US is dominated by IE even more). So how can Firefox justify giving developers an incentive to create web applications for ONE specific browser? Sure, a few years from now JavaScript performance will be much better in the other browsers too. But what until then? Do you think that "Sorry, this site was designed for Firefox 3.1 or higher" is any better than "Sorry, this site was designed for Internet Explorer 5.0 or higher"?

You may ask, "What about in-house applications, for one company?" In-house applications are already dominated by IE and ActiveX, and that's not going to change overnight. Or maybe I'm wrong.

GDevs (Geeky Developers) are rightly proud of their creations. The problem is when they fail to see the world around them. The now almost famous blog post by Ben Collins-Sussman about the two types of programmers contains this pearl:

Shocking statement #1: Most of the software industry is made up of 80% programmers. Yes, most of the world is small Windows development shops, or small firms hiring internal programmers. Most companies have a few 20% folks, and they’re usually the ones lobbying against pointy-haired bosses to change policies, or upgrade tools, or to use a sane version-control system.

Shocking statement #2: Most alpha-geeks forget about shocking statement #1. People who work on open source software, participate in passionate cryptography arguments on Slashdot, and download the latest GIT releases are extremely likely to lose sight of the fact that “the 80%” exists at all. They get all excited about the latest Linux distro or AJAX toolkit or distributed SCM system, spend all weekend on it, blog about it… and then are confounded about why they can’t get their office to start using it.

Fortunately for the open source community, people like John Resig, Andreas Gal, Mike Shaver and Brendan Eich are in the 20% crowd. Let's just hope they won't lose sight of the rest of us :)
Aug 19, 2008

How to change the author of a git commit?


I recently needed to rewrite the history of commits in a private Git repository, because I wanted to change my email address in every commit. Note that you should not try the following tip in a repository that anyone has pulled from. Normally Git doesn't let you do this kind of thing easily, since changing authorship is... well, bad (and possibly even against the law).

Let's assume the email address changed from dev@comp1.com to dev@comp2.com. To copy the repository $PROJECT_DIR into a new repository $NEW_REPO with the emails changed, we can do the following:

$ cd $PROJECT_DIR # change to project repository

$ git fast-export --all > /tmp/project.git # export repository to temporary file

$ sed -i 's/^author\(.*\)<dev@comp1.com>/author\1<dev@comp2.com>/' /tmp/project.git # rewrite emails in-place on every line starting with 'author' (GNU sed)

$ cd $NEW_REPO # change to empty directory

$ git init # initialize git

$ git fast-import < /tmp/project.git # import modified repository

The third step is potentially dangerous, because you have to make sure that you don't edit the contents of any file, only the metadata. If you change file contents, git fast-import will complain, because the SHA-1 hashes will no longer be correct.
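A quick sanity check after the import never hurts either:

$ git log --pretty=format:'%h %an <%ae>' | head   # the rewritten addresses should show up here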

Be sure to read the git fast-import and git fast-export man pages for additional information. It took me a while playing with git-rebase and similar tools to realize that they do not offer such a feature, so if this tip helps anyone else, I'll be glad.

Picasa Album Downloader roadmap

In my first post about the Picasa Album Downloader Java applet, I promised more in-depth technical information about the project.

The idea for the project came when a few of my less computer-savvy friends wanted to download all the photos from my Picasa Web Album. So far there have been a few different ways to do that:
  • install Picasa and use it,
  • install some other software to download photos,
  • go photo-by-photo and download them one-by-one.
None of those methods is very user friendly. Why isn't there a "Download album as zip archive" link on Picasa? I have a few theories, but that's probably for another blog post :)

The question is: how do we let users download Picasa albums easily? Apparently I was not the only one with the idea of creating a web service that builds a zip file for users to download. Fortunately, Google provides APIs for most of its services in a few languages. More precisely, you can access Picasa easily using:
  • .NET
  • Java
  • PHP
  • Python
Since I started learning Python step by step a few months ago, I thought about using it for the job. Then I realized that I would need hosting for the web service. There are not too many free Python hosting services, and those that are free usually have some restrictions.

Google itself provides hosting through its App Engine, with support for Python in particular. I created a simple prototype Python script that was able to download a selected album from a given user into a chosen output directory. It ran just fine when I tested it on its own, but stopped working when run inside dev_appserver.py. The reason? Hotlinking.
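For the curious, fetching an album's photo list through the API basically means pulling the album's Atom feed, roughly like this (the feed URL is from memory, so double-check it against the Picasa Web Albums Data API docs):

$ wget -O album.xml 'http://picasaweb.google.com/data/feed/api/user/USERNAME/album/ALBUMNAME?kind=photo'   # the media content url attributes in album.xml point to the actual images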

Picasa Web Albums checks the Referer header, and if it doesn't match one of the Google domains, Picasa blocks access to the images. Since App Engine doesn't allow clearing the Referer header, this effectively rules out using full-scale images from Picasa in App Engine. So Python is out. What else is there?

I don't have much experience with .NET, and I also don't think it would be suitable for a web application that is supposed to be free. I already had some experience with PHP, and for a project like this one it would probably do the job just fine. There was a problem though... the Google Data API client needs at least PHP 5.1.4 to work, but the hosting services I had at my disposal had older versions installed.

Status? Python out, .NET out, PHP out... which leaves Java. And here we are. The result is a Java applet that lets users download a full Picasa album without installing any software. There is also a project page on Google Code. The first version took about a day to code. I released it under the GPLv3, so if you want to contribute, you are welcome to do so. If you find any bugs or have ideas for making the applet better, let me know.

Aug 14, 2008

Picasa Album Download

Picasa Web Albums is a great service. As far as I can tell it has very few disadvantages compared to competing websites, although I have never used Flickr or similar services, so I am not really one to judge.

There is one thing about Picasa Web Albums that quite a few people have asked me about:
Can I download a whole album from Picasa at once, without having to click through all the photos one by one?
Well, I used to tell people to install Picasa on their computer, but less tech-savvy users had problems with this approach. Some companies also have restrictions on installing software on their networks. No wonder, given the number of trojans, malware and similar things on the Internet these days; getting rid of them can take forever...

I found quite a few projects dealing with downloading from Picasa. All of them required installing some application (or at least downloading one). The perfect solution? A web service.

As an aspiring software engineer (pun intended) I set out on a quest to solve this problem once and for all. The goals:
  • Download a complete Picasa Web Album to a computer without having to install anything first.
  • Multiplatform (Windows, Linux, MacOS X, ...) support; ideally a solution requiring nothing but a browser.
Simple, right? Well, yes and no. I will publish the technical details and the solutions I tried in another post (edit: I already did). Now, without further ado, I present to you:



Lorem ipsum?

I created this blog some time ago, but so far I have not written any posts. Why? Well, I read a lot of blogs, and the best ones usually meet certain criteria:
  • interesting content,
  • interesting form,
  • a lot of interlinking to other sources of information,
and most importantly: they are updated regularly.

Can I do all of those things? Well, we'll see... If nothing else, this will be nice to read in 20 or so years (if I manage to keep it up for at least a few months :) ). I will try to update this blog twice a week and we'll see how that goes.

Topics that I will most probably write about include:
I will most probably get a lot of ideas from other blogs that I read. If you just want to get an idea of what I am interested in, you can always look at my shared Google Reader feeds.

So long and thanks for all the fish.

P.S. Oh yeah... one more thing. English is not my first language, so there may be occasional "hiccups". Sorry.
