Concurrent Datastructures with good Performance
Short description: Concurrent Bag and Concurrent Priority Queue with good Performance.
Additional info: http://hackage.haskell.org/trac/summer-of-code/...
Proposal: Implement 2 Concurrent Datastructures in Haskell
or alternatively if the above is not possible:
My Personal Goals for this Project:
Write parts of a software library fit for general usage.
Gain experience in writing productive software.
Improve my knowledge and skills in several areas, including Functional Programming, Parallel Programming, Software Verification, Testing software for Performance, and Coding skills in general.
I am doing a Seminar presentation in Concurrency and Parallelism, and would choose Concurrent Datastructures as Topic, which would include my work for this project.
Of course doing a presentation referring to your own work is more fun, than just rehashing papers.
The Project primarily entails:
1. Implementing a Concurrent Bag:
I will implement a Lock Free Concurrent Bag according to this paper:
I have also considered to implement instead:
But I am at the moment not sufficiently convinced of this paper.
2. Implementing a Priority Queue
I have not yet decided which proposal for a Concurrent Priority Queue I would implement.
I would consider, Quantizing Queue, Lock Free Skip List or others.
I would generally try to implement the Data Structures with the best Performance and Scalability available.
I would consider changing the above Proposal to implement other Data Structurs in case of greater need or performance.
The Project would benefit the wider Haskell community insofar that they will hopefully get some Concurrent Datatypes of sufficient quality, to be used in Parallel Software.
It is necessary to have such Datastructurs with better performance and scalability, the reason to implement them nativly in Haskell is because using ffi does not give sufficient performance for technical reasons.
The tasks allocated for the coding phase amount to about nine weeks,
but with plenty optional tasks available.
For the pre coding phase I would do some further research into Concurrent Datastructurs, and look performance and correctness tests.
Do some community bonding, and sharpen my haskell skills.
-Priority Queue (15-20 Days)
-Concurrent Bag (15-20 Days)
-The Datastructures are well documented, like in an standard library
-Unit Tests for both Structures (7-10 Days) these would be of the same scope as the ones used here:
I would try to reuse existing testcode as far as possible, possible also by using testcases via FFI.
I'd like to use large numbers of automatically generated testcases, like using quickcheck and smallcheck
I would test single thread first then multithreaded.
(I would Implement Unit Tests First, then the Datastructure requiring the least amount of Code then the other)
-should there be bugs or unfinished tasks remaining, I would provide a description of those.
- A small Benchmark comparing with libcds on a multiprocessor machine,
the implemented Datastructurs should compare adquatly in performance and scalability.(10 Days)
-I would also blog about this project weekly (5 Days)
-A larger benchmark on an machine with more processors (like 24) (10 Days)
- Formal Verification of one of the Datastructures, in case state explosion would allow that.(Time unknown, presumably large)
In my opinion, Concurrent Datastructures are really a good target for formal verification.
I did talk to some of the people at my university, about that.
I would assert the possibility to do this in the first two weeks of the pre coding phase.
In my opinion delivering a datastructure that works correctly, but does not have the expected performance, or is not tested sufficiently to be considered reliable would be of little value.
In case there should be problems or time issues, y fallback plan is to deliver only one Datastructure, but well tested and documented.
So that under all conditions my project would yield at least one usable datastructure.
So I would do multiple datastructures, mostly but not strictly one after another, and if there should be problems by the half of the coding period, adapt my planning accordingly.
-I have just submitted an patch to scummvm:
for this problem:
-I took a course in Advanced Functional Programming in Haskell recently, this did include a Team Project. The Task was to write a write a Bot in Haskell for playing Lambda the Gathering.
-In another exercise I did implement an LTL Modelchecker in Scala, using bounded Modelchecking using an existing SAT-Solver.
- I took several courses in Model-Checking, so I have a good understanding of Temporal Logic and how to describe the Properties of a Concurrent Datastructure.
I did take courses about parallelism both from an abstract perspective, like parallel algorithms, arrays of processor nodes, to a more technical level, scheduler, cache, threads, locks etc.
-I have created a Prototype of an Webbrowser, P3-Nessi, witch is operated by a Brain Computer Interface using P300 Potentials. It is designed to enable locked in Patients ( meaning unable to make any voluntary Movement in most cases suffering from ALS). It was based on Research into P300 Potentials, Brain Computer Interfaces and an earlier prototype.
The repository is available under:
(As a note I have yet to add opensource licenses to some of the files, so it is not yet license compliant, I am going to fix that).
For an overview see the manual:
There is a complete release available at:
(although this is not the latest code)
My Task in this Project was chiefly a software engineering one, of mating an existing User Interface (using BCI) with an existing browser.
This has been the subject of my Bachelor Thesis.
Interaction with the community, how to get help:
I am on the #haskell irc channel with the nick: Mathias
I read the haskell cafe and beginners mailinglist.
Should I get into trouble with basic haskell programming I would ask on the beginners list or in the irc channel.
I would start an Blog on Planet Haskell and post weekly.
I would post the availablity of the Datastructures on haskell cafe (if that's Ok)
I would make a presentation, and put it online.
As said I am doing a seminar at my university, and I could get help with some conceptual problems here.
I have and would post on the Trac page for the task.
I would ask if there are specific wishes, into what datastructures to implement.
I would assume that some people would be interested to possibly use the work that I would create, otherwise there would be little point in doing it.
I consider doing a little exercise first, to get in shape.
B.Sc. in Computer Science, minoring in Psychology at the University Tübingen.
I am currently studying for a masters degree.
I have worked as a research assistant, coauthoring this ITU-T Workshop presentation:
My contribution was to design and execute a series of experiments in witch probands were to rate the quality of compressed audio files.
I did work on some small datamining tool as an exercise, that I might publish.
I did also work as a teaching assistant running a training group for a lecture in cryptology.
I have also served as a councilor and an executive officer for the student council of the WHO student village in Tübingen.
Why I am the best person to tackle this project:
-I never said that, but I am very certain that I can pull it of.
-Even in case that I don't get the GSoc Slot I am thinking of doing at least one data structure.
-I have some academic training regarding all aspects of project, so it is unlikely that I will fail du to a lack of understanding the theoretical background,
I can research, and understand papers about Concurrent Datastructures,and if something does not work I can figure out why.
-I rarely make mistakes when implementing an algortihm from a reference, and work reasonable fast.
-On the negative side I don't have much experience on coding in haskell, but I do like functional programming.
-I did make some meaty comments on the trac page.