Mar 29, 2007
It's been a while since I blogged about Conduit, and a lot has been going on. My main task has been working on the conflict resolution UI and on tweaking the conversions between datatypes so as to maintain maximum fidelity. John Carr, the other main Conduit contributor, has also been working on some awesome stuff.
Conflict Resolution
During the synchronization process a conflict can occur in any of the following scenarios:
-
In a one way synchronization, if the destination data is newer than the source data (the "exported data should not clobber user updated data" case)
-
In a two way sync when the destination data has been modified since the last synchronization (the two computers sync to the same location but both have had source changes case)
-
In any situation when (for backend specific reasons) a comparison between data is unable to determine which is newer than the other.
-
A piece of source data has been deleted and the user has selected a synchronization policy which states they want to confirm before the last remaining copy of that data is deleted at the destination. (I handle deletion as a special case of a conflict.)
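The four scenarios above can be sketched as a single decision function. This is only an illustration and not Conduit's actual code: the Record type, the Comparison enum, the is_conflict() helper and its reason strings are all hypothetical names, and modification times are assumed to be plain numbers.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Comparison(Enum):
    NEWER = "newer"
    OLDER = "older"
    EQUAL = "equal"
    UNKNOWN = "unknown"  # the backend could not tell which copy is newer


@dataclass
class Record:
    """Hypothetical piece of data: an id plus an optional modification time."""
    uid: str
    mtime: Optional[float]


def compare(a: Record, b: Record) -> Comparison:
    """Compare a against b by modification time."""
    if a.mtime is None or b.mtime is None:
        return Comparison.UNKNOWN
    if a.mtime > b.mtime:
        return Comparison.NEWER
    if a.mtime < b.mtime:
        return Comparison.OLDER
    return Comparison.EQUAL


def is_conflict(source, dest, two_way, last_sync, confirm_deletes=True):
    """Return a short reason string if the user must decide, else None."""
    # Deleted source data: only a conflict if the policy asks for confirmation.
    if source is None:
        return "delete" if (dest is not None and confirm_deletes) else None
    if dest is None:
        return None  # nothing at the destination that could be clobbered
    result = compare(dest, source)
    # The comparison could not determine which copy is newer.
    if result is Comparison.UNKNOWN:
        return "unknown"
    # Two way sync: the destination changed since the last synchronization.
    if two_way and last_sync is not None and dest.mtime > last_sync:
        return "dest-modified"
    # One way sync: never clobber destination data newer than the source.
    if not two_way and result is Comparison.NEWER:
        return "dest-newer"
    return None
```

For example, is_conflict(Record("note", 10), Record("note", 20), two_way=False, last_sync=None) reports "dest-newer", while swapping the two modification times reports no conflict.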
Dataprovider backends communicate conflicts to the UI via signals. The UI has access to the conflicting data so can offer the user the ability to compare which is newer. Conflicts use the same arrow metaphor that the main UI uses and can either be resolved in the main window, or be shown with more detail in their own window.
If the data has set a (gnome-open'able) URI then the user can view the data at this location and decide which is newer. For example, a user is synchronizing his Tomboy notes via an iPod. If there is a conflict in this scenario then clicking Compare will cause Tomboy to show the local note and will launch a text editor to inspect the conflicting note off the iPod. (Lazyweb request: Any idea how I might arrange the windows side-by-side, via libwnck for example?)
Fidelity
The main difficulty with synchronizing different datatypes, or the same datatypes via a middleman, is maintaining fidelity (that is, not losing any information) during the whole process. In the case of contact or calendar data, the default would be to support just the standard datatypes (vcard, ical, etc).
-
OpenSync 0.30 is looking to define its own XML schema for basic datatypes, and I would support this idea and likely adopt the same schema.
-
Apple .mac uses strict datatypes and pushes everything to a central server. One alternative to maintaining fidelity is to keep data in all formats but maintain a single canonical source of modification times, and perform conversions (even if they lose fidelity) only when needed from this central store.
This leads on to my next suggestion. Having an optional central repository (like .mac) not only addresses some of the difficulties of maintaining fidelity through conversion but also opens up some other use-cases which the open source world has not addressed.
-
Online editing of Tomboy notes while also being able to sync them on multiple computers (I can currently do this through backpackit.com but the conversions lose fidelity/information)
-
Synchronize desktop settings (wallpaper, theme, nautilus prefs, etc) between two computers
-
One click file sync with no configuration.
So, lazyweb hackers: who here is a Django hacker who wants to write me a web service to do this? I already have a documented API that you could write against. I have the spec and the idea, but not the time. I had considered writing this myself and setting it up as a paid service (I would pay a few dollars a month to be able to do this) but I don't have the time, the web development experience, or an inkling of the artistic ability needed to make it look nice! Any takers, please email me!
Miscellany
-
John has finished up all the remaining pieces to allow iPods to be hotplugged/removed in conduit, so that synchronization settings and state are preserved.
-
He is also hacking on direct computer-to-computer sync over the local network, using Avahi for discovery and pickling data over the wire. Should be awesome when this lands!
-
Unfortunately I was not eligible for Summer of Code this year: I am on holiday from university for six months and won't be starting my PhD until after the summer. I was pleased to see Conduit mentioned on the Ubuntu SoC ideas page, so I am glad to see distros recognizing the importance of making synchronization easy!
-
Releasing 0.3.0 is blocked on my having a working Ubuntu Feisty install to test against. (Lazyweb request: Do suspend, resume and hotkeys for Panasonic subnotebooks such as the CF-R4 work on Feisty out of the box? They have not done so in previous releases.)
Update: Google just released Python GData bindings. This will make excellent support for Google Calendar, Google Notebook and a few other ideas I have up my sleeve a reality in the near future! My blog is also back up, sorry for the trouble.
Mar 12, 2007
At the request of a few people I have made a demo to share, showing how I do threading in a pygtk application (Conduit). I find the following approach seems to work reliably and requires little code. Other approaches can be found here and here.
This approach takes advantage of the fact that signal emission in glib has been threadsafe since glib 2.8 (IIRC). All communication with the GUI is done via gobject signals. There is definitely a compromise in the level of GUI fiddling that you can do, particularly when compared with the threads_enter/leave approach. However, I have found this signal based approach sufficient in my case, and it encourages me to decouple the slow blocking tasks from the GUI.
A lot of the tasks in Conduit take a long time (network limited), and there is no real need for extensive GUI interaction with them once they have been started. I am really only interested in their progress and in when they complete. With this in mind I have implemented the following approach:
-
FooThreadManager
This class is the entry point for starting threads (the make_thread() method). It's basically just a thread pool that starts threads with the appropriate arguments while keeping the number of concurrently running threads within a user defined limit. It also connects the threads to the supplied user callbacks.
-
_IdleObject
Like a normal gobject.GObject but emits all signals in the main thread
-
_FooThread
A simple class which derives from both threading.Thread and _IdleObject. All work is done in run(), and signals are emitted to show progress and when the thread completes.
-
Demo
A simple demo (see screenshot) which can start a whole bunch of threads and receive notification when they complete.
Anyway, the Example Code is a bit contrived and will certainly need some customization by the user but nonetheless may still be useful to others.
Update: Thanks to comments I fixed up a thread-safety issue. I had not understood that signal handlers are run immediately inside emit(), in the emitting thread. Now the code emit()s from an idle handler so all signals and callbacks are run in the main thread. As I mentioned, I use this approach in situations where the threads may run and block for a long time, so the burden of processing all the signals in the main thread is not a big deal as they do not occur frequently.
Update 2: Added progress reporting
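For readers without pygtk to hand, here is a rough standard-library sketch of the same shape. It is not the demo code linked above: a queue-backed MainLoop class stands in for the glib main loop and its idle_add(), and the class names simply mirror the ones described in this post without the leading underscores. Signal names and the make_thread() arguments are assumptions made for the example.

```python
import queue
import threading


class MainLoop:
    """Stand-in for the glib main loop: callbacks queued with idle_add()
    run later, in whichever thread calls run_pending()."""

    def __init__(self):
        self._pending = queue.Queue()

    def idle_add(self, callback, *args):
        self._pending.put((callback, args))

    def run_pending(self):
        while True:
            try:
                callback, args = self._pending.get_nowait()
            except queue.Empty:
                return
            callback(*args)


class IdleObject:
    """Stores signal handlers and only ever invokes them via the main loop."""

    def __init__(self, loop):
        self._idle_loop = loop
        self._signal_handlers = {}

    def connect(self, signal, handler):
        self._signal_handlers.setdefault(signal, []).append(handler)

    def emit(self, signal, *args):
        # Safe to call from any thread: handlers are queued, not run here.
        for handler in self._signal_handlers.get(signal, []):
            self._idle_loop.idle_add(handler, self, *args)


class FooThread(threading.Thread, IdleObject):
    def __init__(self, loop, name, steps):
        threading.Thread.__init__(self, name=name)
        IdleObject.__init__(self, loop)
        self.steps = steps

    def run(self):
        for step in range(1, self.steps + 1):
            # The slow, blocking work would go here.
            self.emit("progress", step / self.steps)
        self.emit("completed")


class FooThreadManager:
    """Runs at most max_threads at once; the rest wait their turn."""

    def __init__(self, loop, max_threads):
        self._loop = loop
        self._max_threads = max_threads
        self._running = []
        self._waiting = []

    def make_thread(self, name, steps, on_progress, on_completed):
        thread = FooThread(self._loop, name, steps)
        thread.connect("progress", on_progress)
        thread.connect("completed", on_completed)
        thread.connect("completed", self._on_thread_completed)
        if len(self._running) < self._max_threads:
            self._running.append(thread)
            thread.start()
        else:
            self._waiting.append(thread)
        return thread

    def _on_thread_completed(self, thread):
        # Runs in the main thread, so the lists need no locking.
        self._running.remove(thread)
        if self._waiting:
            next_thread = self._waiting.pop(0)
            self._running.append(next_thread)
            next_thread.start()
```

The main thread is expected to call run_pending() regularly, which is the job the real main loop does for you. Because every handler, including the manager's own bookkeeping, runs from that call, the callbacks can touch GUI state without further locking.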
Mar 10, 2007
A lot has been happening on the Conduit front of late, and I probably should have blogged about it earlier. Unfortunately, it has seemed as though I have made 5 steps forward and 4 steps back with every hour of work on the project, so why, and what's up?
Background
I will be dropping a 0.3 release to coincide with the release of GNOME 2.18. This will be a developer release designed to get things working well before a stable 0.4 release some time after.
All of Conduit's synchronization capability is determined at runtime. Conduit scans the user's and the system wide dataprovider directories in the same way that Deskbar loads handlers when it starts. The core synchronization logic is decoupled from the dataproviders, and only the relevant dataproviders are shown to the user (e.g. iPod Notes is only shown if an iPod is connected).
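A minimal sketch of this kind of runtime discovery, assuming each plugin file exposes a module-level DATAPROVIDERS list. That name, the load_dataproviders() function and the is_available filter are hypothetical and chosen for illustration; they are not Conduit's real plugin interface.

```python
import importlib.util
from pathlib import Path


def load_dataproviders(directories, is_available=lambda cls: True):
    """Import every *.py file in the given directories and collect the
    classes each one lists in its (hypothetical) DATAPROVIDERS attribute."""
    found = {}
    for directory in directories:
        for path in sorted(Path(directory).glob("*.py")):
            spec = importlib.util.spec_from_file_location(path.stem, path)
            module = importlib.util.module_from_spec(spec)
            try:
                spec.loader.exec_module(module)
            except Exception:
                # A broken plugin should not take the whole application down.
                continue
            for cls in getattr(module, "DATAPROVIDERS", []):
                # Only offer dataproviders that make sense right now,
                # e.g. hide iPod ones when no iPod is connected.
                if is_available(cls):
                    found[cls.__name__] = cls
    return found
```

Passing the user's directory after the system wide one would let a user-installed dataprovider override a system one of the same name, since later entries replace earlier ones in the returned dictionary.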
Remember that I want Conduit to be a GNOME wide synchronization service that other applications can hook into, so I need to not only decouple the core sync logic from the application and usage specific dataprovider back ends, but also allow conduit to be extended by third parties.
Extension by 3rd Party Applications
I mentioned that all synchronization capability is determined at runtime. The other half of this story is that a third party may wish to synchronize something that Conduit knows nothing about (I'll use a Jokosher project as an example, although in my next post I will address these issues in the Tomboy case). Let's assume that a Jokosher project cannot be described in terms of the basic types of data that Conduit knows about (email, contact, file, etc). In order to allow Jokosher projects to be synchronized between different computers the Jokosher developers would need to do the following:
-
Define a Jokosher datatype that describes what it is to be synchronized.
-
(optionally) Define some conversions between the Jokosher datatype and the basic Conduit datatypes (File, Text, etc)
-
Define a dataprovider that encapsulates information specific to a Jokosher invocation, such as the Jokosher user and how to extract or modify Jokosher data while Jokosher may be running.
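In code, those three pieces might look roughly like the sketch below. Everything in it is hypothetical: the JokosherProject fields, the Text stand-in for a basic datatype, the conversion() registry and the dataprovider methods are invented for illustration and are not Conduit's or Jokosher's real API.

```python
from dataclasses import dataclass, field


@dataclass
class Text:
    """Stand-in for one of the basic datatypes."""
    content: str


@dataclass
class JokosherProject:
    """Step 1: a datatype describing what is to be synchronized."""
    name: str
    tracks: list = field(default_factory=list)


# Step 2: optional conversions to and from the basic datatypes,
# kept in a registry keyed by (source type, destination type).
CONVERSIONS = {}


def conversion(src, dst):
    def register(func):
        CONVERSIONS[(src, dst)] = func
        return func
    return register


@conversion(JokosherProject, Text)
def project_to_text(project):
    return Text("\n".join([project.name] + project.tracks))


@conversion(Text, JokosherProject)
def text_to_project(text):
    name, *tracks = text.content.split("\n")
    return JokosherProject(name, tracks)


def convert(value, dst):
    """Look up and apply the registered conversion for this pair of types."""
    return CONVERSIONS[(type(value), dst)](value)


class JokosherDataProvider:
    """Step 3: knows how to get projects out of, and into, Jokosher.
    Here the projects simply live in a dictionary."""

    def __init__(self, projects):
        self._projects = {p.name: p for p in projects}

    def get_all(self):
        return list(self._projects.values())

    def put(self, project):
        self._projects[project.name] = project
```

With a conversion to Text registered, any dataprovider that accepts text could carry a project, which is the sense in which the number and fidelity of the conversions decide how many other sync scenarios become available.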
The Conduit framework will do a lot of heavy lifting here, such as caching modification times, providing a way for the user to configure a Jokosher sync partnership, providing a way for a user to resolve conflicts, and even allowing the whole sync process to be controlled from outside of the application over DBus. Most interestingly, however, depending on the fidelity and number of conversions specified in step 2, Conduit will facilitate a number of other useful sync scenarios, including:
-
Sync Jokosher projects via any gnomevfs compatible location (ftp, etc)
-
Sync via Amazon S3 (coming soon)
-
Sync via USB key
-
Sync directly over a local network using Avahi. (next release)
The point here is that decoupling the dataproviders from the core and from the datatypes (sans the conversion functions) lets the Jokosher dataprovider take advantage of any and all additional $FUTURE dataproviders (e.g. Amazon S3) and all improvements to the Conduit core (such as direct sync using pickle'd python over the network via Avahi).
Testing
As I get closer to release I get more nervous about releasing a tool which could eat people's data. To mitigate this risk I spent a week or two implementing an extensive testing framework for Conduit, a task I should have tackled much earlier in the project.
Unfortunately the testing is made more complicated when the application's capabilities can change between invocations (the dynamic loading of dataproviders and conversions explained earlier). Aside from the usual unit level tests (does this throw the correct exceptions, etc) I am also required to test the fidelity and accuracy of the conversions between datatypes (round trip comparisons), and the fidelity of synchronization from dataprovider $FOO to dataprovider $BAR via any of these datatypes.
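A round trip comparison can be sketched as follows. This is an illustration rather than Conduit's test framework: the Note type, the conversion table and round_trip_failures() are hypothetical, with one conversion made deliberately lossy to show what such a test catches.

```python
from collections import namedtuple

# Hypothetical rich datatype; converting it to plain text drops the tags.
Note = namedtuple("Note", "title body tags")


def note_to_str(note):
    return note.title + "\n" + note.body


def str_to_note(text):
    title, _, body = text.partition("\n")
    return Note(title, body, ())


CONVERSIONS = {
    (str, bytes): lambda s: s.encode("utf-8"),
    (bytes, str): lambda b: b.decode("utf-8"),
    (Note, str): note_to_str,
    (str, Note): str_to_note,
}


def round_trip_failures(conversions, samples):
    """Return the (source, destination) type names for which converting a
    sample there and back does not reproduce the original value."""
    failures = []
    pairs = sorted(conversions, key=lambda p: (p[0].__name__, p[1].__name__))
    for src, dst in pairs:
        backward = conversions.get((dst, src))
        if backward is None:
            continue  # one way conversion, nothing to compare against
        forward = conversions[(src, dst)]
        for sample in samples.get(src, []):
            if backward(forward(sample)) != sample:
                failures.append((src.__name__, dst.__name__))
                break
    return failures


SAMPLES = {
    str: ["title\nbody"],
    bytes: [b"raw data"],
    Note: [Note("title", "body", ("music", "draft"))],
}
```

Here round_trip_failures(CONVERSIONS, SAMPLES) flags only the Note to str conversion, because the tags do not survive the trip. Since the set of conversions is discovered at runtime, a loop over whatever is registered is more practical than hand-written tests per pair.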
The testing framework uses a mixture of python and shell script, and is even web 2.0 compliant ;-). It currently tests available conversions, saving and getting data from various dataprovider back ends (Flickr, gnomevfs, Tomboy, etc). It even does code coverage and pushes the results on line.
What's Next
I apologize for this post being mostly academic in nature and lacking in screenshots. I will post again in a few days with some more relevant information (from a user's perspective). I just wanted to highlight how and why I believe a desktop sync service is useful, and how I am trying to satisfy this in a sustainable way for the GNOME platform as a whole.
I anticipate finishing up the two way Tomboy support (via iPod notes, gnomevfs location, and backpackit.com) in the next few days. There are some bugs preventing Conduit from running on Feisty (goocanvas 0.6.0 is ABI incompatible with 0.4.0). There are also some other loose ends (like some GUI blocking calls in conflict resolution, and some difficulties saving program configuration). Unfortunately, like a lot of FOSS, the best is always just around the corner...
Feb 19, 2007
Hey Planet GNOME'rs
Thanks Jeff for putting me on Planet GNOME. My name is John Stowers and I'm currently splitting my time between a few things that might be of interest to people here.
-
Synchronization and the GNOME Desktop
I probably spend most of my free time hacking on Conduit, which is a synchronization application for GNOME. I hope to provide a (DBus) service where application authors can use Conduit for their individual sync and export capabilities, and don't have to keep reimplementing them in their own applications. Furthermore as a stand alone application I aspire to the ease of use of Apple's .Mac while supporting core GNOME technologies.
More information
-
Conduit Website
-
Slides and Video from a talk at Linux.conf.au
-
The next release is scheduled for around the same time as the release of GNOME 2.18 and will support that favorite requested feature, *cough* Tomboy synchronization *cough*. Thanks Boyd.
-
A Metadata Enabled GNOME
I'm pretty excited about Tracker and the ability to move away from a folder centric GNOME. By embracing metadata, tagging, and a desktop indexer I think that we will leap past our competitors in terms of integration between applications. To this end I am hacking on:
-
Nautilus + Tracker integration
Injecting some love into the nautilus emblem functionality. Support tagging and attaching emblems (emblem = a tag with an image) from either nautilus or Tracker.
-
Nautilus metadata integration
Working with neilj on a nautilus-on-steroids + libtracker-gtk. Let's make metadata visible (in nautilus) and easy for application authors to add to their apps (libtracker-gtk)
-
GtkFileChooser + Tracker
Attach tags and notes at save time (what's the state of the filechooser extension spec?)
Other than that, I'm 23 years old and have just finished a master's in electrical engineering at the University of Canterbury. I'm really an electronics nerd, so I'm also hacking on the Albatross Unpiloted Aerial Vehicle. I'm doing some traveling around Europe for the next few months and then probably starting a PhD.
Feb 6, 2007
When Tracker was proposed for GNOME 2.18 I was one of its staunchest supporters, arguing that a GNOME wide unified metadata storage system would enable a richer desktop experience, and take GNOME beyond its competition. Tracker did not make the cut for GNOME 2.18, and will no doubt be proposed again for GNOME 2.20.
To help people see the potential of a GNOME desktop using Tracker I have been working on a few projects for the past little while. All of these are components of my larger vision of a metadata rich GNOME desktop. These initial attempts just focus on making tagging [1] a more consistent experience for all GNOME apps. Consider the examples and screenshots (for a file called nice) below:
-
Freedesktop emblem spec
Allows desktop file managers and indexers to present a list of predefined emblems and tags to the user, and allows these predefined emblems/tags to be installed by third party developers in a consistent manner.
-
Tracker Nautilus Integration
Nautilus using tracker for storage of all tags and emblems.
-
libtracker-gtk
A bunch of gtk widgets that application authors can use to add Tracker functionality to their application.
Industrious individuals could probably find the bzr repositories for the above work, but at the moment it is not quite ready for prime time consumption. Stay tuned for more news.
[1] Terminology:
Jan 26, 2007
Some interesting things have popped up in the FOSS world in the last few weeks. Some of you may have seen these, others may not;
Jan 25, 2007
Well I had a fantastically great time at linux.conf.au. I was fortunate enough to present two talks at the conference;
But I think the coolest thing was meeting heaps of interesting (and impressive) people including Jono, Eugene, Manish, Andy, Davyd, Karla, Nigel, Rafael, etc. I will put up some more photos and some of my new BHAGs for Conduit and Albatross soon.