folding, part i
In a post I wrote a while ago I mentioned something called "distributed computing" and its application in fields like medical research. The specific initiative that I was referring to was the Folding@Home project run by Stanford University. Now that I've started folding regularly, I figured that I should explain a little more of the concept and what's involved. I'm actually really hoping this post isn't ignored out of boredom or perceived technical complexity, because I think the project is a really amazing and important one, and that awareness about it should be spread.
Distributed computing is actually fairly straightforward. We're all familiar with the concept of the supercomputer, a uniquely powerful and specialised machine that handles complex or involved computational tasks. There are a number of fields (e.g. in scientific research) where enormously powerful computers like these are required. Of course, this kind of equipment can cost a lot to operate and maintain, and even then there are general limitations on how much processing power can be crammed into a single working machine. Basically, distributed computing works around these problems by creating a "virtual" supercomputer, which in actuality is a whole bunch of less powerful computers working together, and by "distributing" the workload across this network in small-enough chunks that even these less-powerful computers can process them.
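For anyone who'd rather see the idea than read about it, here is a minimal Python sketch of the "split into chunks, farm out, combine" pattern described above. It is only an illustration and not how the Folding@Home client actually works: the function names (`split_into_chunks`, `process_chunk`, `run_distributed`) are made up for this example, the "computation" is just summing squares, and threads on a single machine stand in for the many separate computers a real project would use.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, chunk_size):
    """Cut one big workload into small, independent work units."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def process_chunk(chunk):
    """Hypothetical stand-in for the real computation on one work unit."""
    return sum(n * n for n in chunk)


def run_distributed(items, chunk_size=1000, workers=4):
    """Hand each chunk to a worker, then combine the partial results."""
    chunks = split_into_chunks(items, chunk_size)
    # Threads stand in for the volunteers' computers; in a real project each
    # chunk would travel over the network to a separate machine.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(process_chunk, chunks))
    return sum(partial_results)


if __name__ == "__main__":
    workload = list(range(10_000))
    print(run_distributed(workload))
```

The important design point is that each chunk can be processed without knowing anything about the others, which is what lets a modest home computer contribute a small piece of a very large job.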
It seems, then, like distributed computing trades off one expense for another: an institution like Stanford avoids the cost of having to maintain supercomputers, and instead that cost -- the initial cost of equipment, and the upkeep cost of electricity -- gets pushed onto the people who participate. But it really isn't that sinister, because the reality is that there's an enormous amount of waste on most home computers. That's to say that most computers these days are fairly powerful, and are a lot more powerful than they need to be to do the things we want them to do (word processing, browsing around on the internet, chatting).
To underscore the point: if you're running Windows, you can open up your "Task Manager" (if you don't know what I'm talking about, just ignore this part) and see, under the "Performance" tab, a graph of how hard your CPU (the brain of your computer) is being worked.[1] Chances are that 95% of the time, you aren't using more than 5-10% of your CPU power (I actually rarely ever exceed 1%), and that a huge portion of your computer's computing potential isn't being utilised. Distributed computing taps those spare resources and puts them to good use.[2]
So that's distributed computing, in a nutshell. Folding@Home is a specific project run by Stanford's chemistry department. Its purpose is to run simulations of protein folding, in order to gather data about the folding process and to analyse related phenomena. I can't go into a full explanation (partly because I want to keep this concise, but more so because I don't fully understand it myself -- all I know is what little I've read online and extracted from my dad over the dinner table), but the research is directly linked to studies of diseases like Alzheimer's, Parkinson's, and even some cancers.
There's so much more to say about it, but I'm probably not the best person to say it. A wealth of information is available on the Folding@Home website, including instructions on downloading and using the Folding client. The instructions might be a bit technically involved, though -- so if any of you are interested but can't quite figure out how to make it work, you can ask me any questions and I'll be happy to try to help.
This post is a preliminary one, in which (I hope) I've clearly outlined the concept and mechanics of distributed computing in general, and Folding@Home in particular. I'll follow up in a few days with another post on some of the broader ideas and issues I've been reflecting on in relation to this project.
Notes:
[1] Ctrl+Alt+Delete -> Task Manager (unless Ctrl+Alt+Delete already gets you to Task Manager), and then click on the tab labelled "Performance".
[2] Now, granted, a computer that isn't working as hard doesn't use as much electricity, so running a distributed computing program means a higher power draw out of the wall socket. Still, even if power usage and the associated costs (financial and otherwise) are higher, there's a huge difference in efficiency -- a distributed computing program makes use of far more of the energy your computer is eating up than your average, mostly-idle usage does.
haha, that was probably one of the most involved descriptions of distributed computing i've read in a while... when describing stuff like this, we usually tend to draw analogies to personal life scenarios (e.g. you've got 25 math questions that take a minute each. you hand out one question to 25 different kids and get the answers back). which in a nutshell parallels exactly what F@H does (it even parallels well with the overhead involved in initial distribution decisions, networking speeds, overhead cost, etc.)
Hint: You can get to task manager with:
Ctrl + Shift + Esc
Haha I really only talked about the mechanics of DC in one sentence -- I figured the concept was straightforward enough to be understood. I was less concerned with the strictly technical aspect and a lot more concerned with the socio-economic (for lack of a better umbrella term) aspect, which I think is talked about a lot less.