In Gentoo Linux, packages are normally built from source code. Almost every package has build-time and run-time dependencies on other packages, and the dependency lists are written by package developers. The official tree alone contains about 15,000 packages, and there are many unofficial trees (overlays). Gentoo follows a so-called rolling release schedule: packages are updated continuously, so a missing dependency can become a problem for the end user.
Gentoo has a feature called preserved-libs, which keeps packages from breaking when a run-time dependency changes its version. I want to focus on the build-time dependency problem.
I have been using Gentoo for almost 3 years and update my system often. Every 2-3 weeks, a package fails to compile because of a missing dependency.
How many missing dependencies have there been in the main tree? A rough estimate is the number of ChangeLog lines that mention both "miss" and "dep":
$ for i in `find /usr/portage/ -name ChangeLog`; do grep -i miss $i | grep -i dep; done | wc -l
Main idea: while a package builds, the build process accesses files that belong to other packages. If one of these files is absent, the build may fail. Let's log all file access events and look up which packages those files belong to. After that, we can automatically generate the package's dependency list and compare it with the one in the ebuild.
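The comparison step could look roughly like the sketch below. It is only an illustration: the function names, the package names and the file paths are made up, and it assumes the access log and the package contents are already available as plain Python collections.

```python
def owners_of(accessed_files, package_contents):
    """Return the set of packages that own at least one accessed file."""
    # Invert the package -> files mapping into file -> package.
    file_to_package = {
        path: package
        for package, files in package_contents.items()
        for path in files
    }
    # Files owned by no package (e.g. the build directory) are ignored.
    return {file_to_package[p] for p in accessed_files if p in file_to_package}


def missing_dependencies(package, accessed_files, package_contents, declared):
    """Packages whose files were accessed but which are not declared."""
    used = owners_of(accessed_files, package_contents)
    used.discard(package)  # a package does not depend on itself
    return used - set(declared)


# Hypothetical data, for illustration only.
package_contents = {
    "cat/liba": ["/usr/include/a.h"],
    "cat/libb": ["/usr/include/b.h"],
    "cat/app": ["/usr/bin/app"],
}
accessed = ["/usr/include/a.h", "/usr/include/b.h", "/tmp/build/main.c"]

print(missing_dependencies("cat/app", accessed, package_contents, ["cat/liba"]))
```

Here the ebuild of the made-up package cat/app declares only cat/liba, so cat/libb is reported as a candidate missing dependency.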
I want to write two tools for package developers:
- Auto dependency builder
- Dependency checker
The auto dependency builder will collect information about a build process, in particular about the files that were accessed, and send it to a server. The server will collect these logs and create a database: package->accessed files. Knowing all the files each package consists of and having a base of packages, the server will build a dependency list.
The dependency checker will build a package while blocking any operation on files that do not belong to one of its declared dependencies. If the build fails, the check fails.
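The decision the checker has to make for every file access can be sketched as follows. This is a simplified illustration, not the real blocking mechanism: all names and paths are hypothetical, and it only reports which accesses would have been blocked instead of intercepting them.

```python
import posixpath


def build_allow_list(dependencies, package_contents):
    """Collect every file that belongs to a declared dependency."""
    allowed = set()
    for dep in dependencies:
        allowed.update(package_contents.get(dep, ()))
    return allowed


def is_allowed(path, allowed_files, work_dirs=()):
    """True if the path belongs to a dependency or lies inside a work dir."""
    path = posixpath.normpath(path)
    if path in allowed_files:
        return True
    # The package's own build directory must always stay accessible.
    return any(
        path == d.rstrip("/") or path.startswith(d.rstrip("/") + "/")
        for d in work_dirs
    )


def violations(access_log, allowed_files, work_dirs=()):
    """Return the accesses that the checker would have blocked."""
    return [p for p in access_log if not is_allowed(p, allowed_files, work_dirs)]


# Hypothetical data, for illustration only.
contents = {
    "cat/liba": ["/usr/include/a.h"],
    "cat/libb": ["/usr/include/b.h"],
}
allowed = build_allow_list(["cat/liba"], contents)
log = ["/usr/include/a.h", "/tmp/build/main.c", "/usr/include/b.h"]

print(violations(log, allowed, work_dirs=["/tmp/build"]))
```

A real checker would also have to allow the base system (compiler, shell and so on); that part is left out of the sketch.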
- Use flags
- Database organization
- Packages versions
- Package A depends on package B. Package B consists of .h files. During the build, package A accesses package B's file xf86vmode.h. Package B is available in 2 versions: 1.0, which contains xf86vmode.h, and 2.0, where xf86vmode.h was renamed to xf86vm.h. The dependency builder will be able to notify the package developer about a possible breakage.
- The developer of package C forgot to include a dependency on package D. Assume that package D is installed on most systems. Then most users will install package C successfully, but there will be users who cannot build it. The dependency checker can detect this breakage.
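The first example above (the renamed header) could be detected with a check like the one below. It is a sketch under assumptions: the function name is made up, the directory /usr/include is chosen only for illustration, and the file lists of every version of package B are assumed to be known.

```python
def fragile_versions(accessed_files, dependency_versions):
    """Map each version of a dependency to the accessed files it lacks.

    dependency_versions maps a version string to the files it installs.
    Only versions that miss at least one accessed file are returned.
    """
    # Files that belong to the dependency in at least one of its versions.
    known = set().union(*dependency_versions.values())
    needed = set(accessed_files) & known
    report = {}
    for version, files in dependency_versions.items():
        missing = needed - set(files)
        if missing:
            report[version] = sorted(missing)
    return report


# Package B from the example; the directory is an assumption.
versions_of_b = {
    "1.0": ["/usr/include/xf86vmode.h"],
    "2.0": ["/usr/include/xf86vm.h"],
}
accessed_by_a = ["/usr/include/xf86vmode.h", "/tmp/build/a.c"]

print(fragile_versions(accessed_by_a, versions_of_b))
```

Version 2.0 is reported because the header that package A reads no longer exists there, so the developer of A can be warned before users hit the failure.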
Thoughts about an architecture
I plan to use Python for this project.
One possible solution to the USE flag problem is to analyse many log files and find dependencies between USE flags and the files accessed during the build. A server is needed to merge logs produced with different build parameters and to collect statistics. I think the best option is a server that generates XML or JSON files: [package->[files in package]] and [package->[files accessed while building this package]].
However, I am afraid that these files will consume a lot of disk space on the client side (3-100 KB per package), so I want to give the user a choice: download the full base or download on demand. I plan to use a storage/query abstraction layer to provide this choice.
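One very simple way to relate USE flags to accessed files is sketched below. The record layout ("use" and "accessed" keys), the flag names and the paths are assumptions made for the example, not a fixed format; real analysis would need many more logs and some tolerance for noise.

```python
import json


def files_tied_to_flag(builds, flag):
    """Files accessed in every build with the flag and in none without it."""
    with_flag = [set(b["accessed"]) for b in builds if flag in b["use"]]
    without_flag = [set(b["accessed"]) for b in builds if flag not in b["use"]]
    if not with_flag or not without_flag:
        return set()  # not enough data to compare
    always_with = set.intersection(*with_flag)
    ever_without = set().union(*without_flag)
    return always_with - ever_without


# Hypothetical logs of one package built with different USE flags.
builds = [
    {"use": ["ssl"], "accessed": ["/usr/include/a.h", "/usr/include/ssl.h"]},
    {"use": ["ssl", "doc"], "accessed": ["/usr/include/a.h", "/usr/include/ssl.h"]},
    {"use": [], "accessed": ["/usr/include/a.h"]},
]

print(files_tied_to_flag(builds, "ssl"))

# The same records serialize directly to JSON for the server.
print(json.dumps(builds[0], sort_keys=True))
```

Files found this way point to dependencies that should be conditional on the flag rather than unconditional.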
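The storage/query abstraction could be as small as the sketch below. Class and method names are hypothetical, and the network download is replaced by a plain callable so that the example stays self-contained.

```python
from abc import ABC, abstractmethod


class AccessStore(ABC):
    """Common query interface, independent of where the data lives."""

    @abstractmethod
    def accessed_files(self, package):
        """Return the files accessed while building the package."""


class FullStore(AccessStore):
    """Whole database downloaded in advance and kept locally."""

    def __init__(self, database):
        self._database = dict(database)

    def accessed_files(self, package):
        return self._database.get(package, [])


class OnDemandStore(AccessStore):
    """Records are fetched one package at a time and cached."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable: package name -> list of files
        self._cache = {}

    def accessed_files(self, package):
        if package not in self._cache:
            self._cache[package] = self._fetch(package)
        return self._cache[package]


# Hypothetical data; fake_fetch stands in for a request to the server.
database = {"cat/app": ["/usr/include/a.h"]}
requests_made = []


def fake_fetch(package):
    requests_made.append(package)
    return database.get(package, [])


full = FullStore(database)
lazy = OnDemandStore(fake_fetch)
print(full.accessed_files("cat/app"))
print(lazy.accessed_files("cat/app"))
```

The rest of the tool would only talk to AccessStore, so switching between the full base and on-demand download would not touch the analysis code.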
I also plan to add an abstraction layer for file access logging. Possible logging approaches:
- Use ptrace
- Use a root file system mounted through FUSE and chroot into it
- Use the Gentoo sandbox strategy: modify the LD_PRELOAD environment variable and intercept library calls
I have already experimented with the FUSE approach. Sebastian Pipping, a Gentoo developer, kindly provided me with a working logger sample.
Before April 22
- Improve my coding skills (done)
- Do some research about logging approaches (done)
April 22 ‐ May 22
- Do experiments with logging (in progress)
- Discuss the project with the mentor
May 23 ‐ June 5
- Choose the logging approaches to use
- Create a working file access logger application
- Generate many logs for analysis
June 6 ‐ July 3
- Develop the log analyzer (server part)
- Develop the auto dependency builder
July 4 ‐ July 10
- Test the auto dependency builder to prove that it works
July 11 ‐ July 24
- Test the auto dependency builder further to ensure that it works
- Develop the dependency checker application
July 25 ‐ August 15
- Launch continuous dependency checking for packages in the official Gentoo tree
- Mail developers about missing dependencies and help them resolve the problems
- Fix bugs
- Write documentation
- Make an ebuild for the tool
I plan to work on this project 18:00 ‐ 24:00 (GMT+5) on weekdays and 12:00 ‐ 18:00 on weekends.
I have a summer exam session at my university in the first half of June.
My name is Alexander Besenev, 21. I am a Russian computer science student (bachelor). I am studying at Ural State University, in the Mathematics and Mechanics department.
- Information security. I am a member of the Hackerdom team. We organize RuCTF, the biggest information security challenge among Russian universities, and RuCTFE, its international version. We also take part in many information security challenges. I do reverse engineering in the team.
- Distributed computing. I administer the 20 TFlops cluster at the Institute of Mathematics and Mechanics and am writing my master's dissertation on distributed file systems.
- Linux. I have been using Linux as my main operating system for 3 years. Sometimes I fix bugs in it for myself. I have fixed support for Russian characters when flash drives are automounted in KDE, and full-screen Flash on Intel video cards. My "add keybinding to show desktop" patch has been in KDE since 4.4.
Links: firstname.lastname@example.org, twitter:http://twitter.com/alex_bers
Additional info (application template related)
Bugs fixed: 315819 and 308325
Post on mailing list: http://archives.gentoo.org/gentoo-soc/msg_e2a7ed0d7062b40fda352bceb0b7f1ab.xml
- Email: email@example.com
- Phone: +79089258984
Preferred working hours: 18:00 ‐ 24:00 on weekdays and 12:00 ‐ 18:00 on weekends (GMT+5)