Tag based file system using FUSE

The usual way of organizing files on disk is based on directories, which prevents the same file from appearing in several directories. TaggedFS provides a different way of organizing your files, based on tags.

Description

In TaggedFS each file holds a set of tags. Files are accessed through a dynamic tree based on tags. The advantage is that a file can be reached through many locations, depending on the context of use. The drawbacks are that file names must be unique and that tag-based organization is very different from the usual directory-based organization.

Since it is a file system, you can use it through your usual file handling tools (file explorer, Samba mounts, command line…).

Motivation

Before starting this project, I had a look at existing implementations but was not very happy with them. I first tried to implement it using a MySQL database, but it proved not to be very efficient. In addition, having two different storage systems (file system and database) seemed very dangerous. So this implementation relies only on the file system, plus indices that you can rebuild from a single file organisation.
As a fallback, you can restore all your files from the file system (storage directory).

Design

Files are stored in the following structure:
dir/files_structure/%02x/%02x/%02x/%02x/%02x/%02x/%02x/%02x/filename
So there is one directory per file, named after a 64-bit id split into sub-paths, most significant byte (MSB) first. In addition,
dir/files_structure/%02x/%02x/%02x/%02x/%02x/%02x/%02x/%02x/
contains a file named ‘tags’ which holds one tag per line. As you can see, you can easily retrieve your files if something goes wrong.
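
As an illustration of the layout above, here is a minimal Python sketch that maps a 64-bit id to its storage directory. The function name storage_dir_for_id is hypothetical and is not part of TaggedFS; the sketch only assumes the %02x path pattern described above, with the most significant byte first.

```python
import posixpath


def storage_dir_for_id(root, file_id):
    """Return the per-file directory for a 64-bit id, most significant byte first.

    Hypothetical helper: it only mirrors the path pattern described in the text.
    """
    if not 0 <= file_id < 2 ** 64:
        raise ValueError("file_id must fit in 64 bits")
    # Split the id into 8 big-endian bytes and format each as two hex digits.
    parts = ["%02x" % byte for byte in file_id.to_bytes(8, "big")]
    return posixpath.join(root, "files_structure", *parts)


print(storage_dir_for_id("dir", 0x0102))
# prints dir/files_structure/00/00/00/00/00/00/01/02
```

Under this assumption, the ‘tags’ file of that id would sit directly inside the returned directory, next to the stored file itself.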

The tag structure is stored in dir/tags_structure/tag/subtags.

The current id is stored in dir/id. This file is updated whenever you create a new file.

Indexes are kept in dir/indexes. Each index is a key/value store split into several files spread across directories.

Let's say the key is 0xAABBCCDDXX.
The organisation is:
/index/AA/BB/CC/idx, where idx contains the last byte and the value, stored sequentially.

Searching for files having a set of tags can thus be performed recursively through the subdirectories of each tag index, which limits the amount of data to read.

Tag organisation

You can create and organize your tags in the /tags_structure directory. In this directory, tags are represented by directories. For instance, you can create the following tag structure:
/tags_structure/Family/
/tags_structure/Family/Mum
/tags_structure/Family/Dad
...

In this folder you can rename, move or remove tags the way you would for directories. Files are updated accordingly. You can’t create a file in this directory; it is dedicated to tag organization.
In addition, you can create a new tag in the /tags directory. This new tag will be created with no parent tag.

File organisation

Files are accessible from the /tags directory. Let's say picture.jpg has the tag ‘Mum’. The file will be accessible from the following paths:
/tags/Family
/tags/Mum
/tags/Family/Mum

A path like /tags/tag1/tag2/tag3 contains the files having:
tag1 AND tag2 AND tag3
If tag21 and tag22 are subtags of tag2, then the same path contains the files having:
tag1 AND (tag2 OR tag21 OR tag22) AND tag3.
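
The matching rule above can be sketched in a few lines of Python. This is not the TaggedFS implementation, which relies on the on-disk indexes; it is only an in-memory illustration. The names expand and files_at_path, the extra tag ‘Holiday’ and the files car.jpg and beach.jpg are hypothetical, and the sketch assumes that subtags of subtags are included as well.

```python
def expand(tag, subtags):
    """Return the tag together with all of its (assumed recursive) subtags."""
    result = {tag}
    for child in subtags.get(tag, ()):
        result |= expand(child, subtags)
    return result


def files_at_path(path_tags, subtags, file_tags):
    """List files visible at /tags/<path_tags...>.

    AND across the path components, OR inside each component's subtree.
    """
    matches = []
    for name, tags in file_tags.items():
        if all(tags & expand(tag, subtags) for tag in path_tags):
            matches.append(name)
    return sorted(matches)


# Hypothetical data: Mum and Dad are subtags of Family.
subtags = {"Family": ["Mum", "Dad"]}
file_tags = {
    "picture.jpg": {"Mum"},
    "car.jpg": {"Dad", "Holiday"},
    "beach.jpg": {"Holiday"},
}

print(files_at_path(["Family"], subtags, file_tags))
# prints ['car.jpg', 'picture.jpg']
print(files_at_path(["Family", "Holiday"], subtags, file_tags))
# prints ['car.jpg']
```

With this data, picture.jpg shows up under Family, Mum and Family/Mum, which matches the example paths listed earlier.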

Creating a file

You create files in the /tags directory. Be aware that a file name must be unique across the whole file system and can’t be the same as a tag name. If you create a file in /tags/tag1/tag2/tag3, the file will hold the tags tag1, tag2 and tag3 after creation.

Setting tags to a file

Move the file to the path containing all the tags you want. This removes
all the tags from the file and sets the new ones.

Remove a tag from a file

Unfortunately, there is no way of removing a single tag. You must move the file as described above and set the whole set of tags.

Add a tag to a file

Move the file from its current location in /tags to /tags_structure/the_tag_you_want_to_add. The file won’t be moved but the tag will be added.
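
The move-based tag operations described in the last sections can be scripted. The Python sketch below only builds the destination paths; the function names and the mount point /mnt are hypothetical, and the sketch assumes that a plain rename inside the mounted file system is what triggers the tag change.

```python
import posixpath


def set_tags_destination(mount, tags, filename):
    """Destination that replaces all tags of a file with the given ones."""
    return posixpath.join(mount, "tags", *tags, filename)


def add_tag_destination(mount, tag, filename):
    """Destination that adds one tag to a file without moving it."""
    return posixpath.join(mount, "tags_structure", tag, filename)


print(set_tags_destination("/mnt", ["Family", "Holiday"], "picture.jpg"))
# prints /mnt/tags/Family/Holiday/picture.jpg
print(add_tag_destination("/mnt", "Dad", "picture.jpg"))
# prints /mnt/tags_structure/Dad/picture.jpg
```

The move itself could then be done with os.rename(source, destination), or simply with mv from the command line.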

Reading a file’s tags

Alongside each file there is a hidden file that contains the file's tags and that you can read. Just append “.tag” to the file name.
Example:
# cat /tags/Mum/picture.jpg.tag
returns
Mum
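
The same lookup can be done from a script. This is a small Python sketch with a hypothetical helper name, read_tags. It assumes the “.tag” file lists one tag per line, like the ‘tags’ file in the storage directory; the example above only shows a single tag, so treat the multi-line handling as an assumption.

```python
def read_tags(path):
    """Return the tags of a file by reading its hidden '<path>.tag' companion.

    Assumes one tag per line; blank lines are ignored.
    """
    with open(path + ".tag", encoding="utf-8") as tag_file:
        return [line.strip() for line in tag_file if line.strip()]
```

For instance, read_tags("/tags/Mum/picture.jpg") reads /tags/Mum/picture.jpg.tag and returns the tags as a list.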

What you can’t do with taggedfs

Be aware that since the /tags structure is dynamic, you can’t perform recursive operations on it. This directory contains all the possible combinations of tags, and the number of combinations can be huge.
For instance, you can’t do:
- find /tags -name "picture.jpg" -print
- rm -rf /tags/*
- tar cvf tags.tgz /tags/*

Performing backup and restore

To back up your data stored in taggedfs, you should back up the storage directory, including the ‘id’ file. The indexes directory is optional since you can rebuild it.
To restore, put the storage directory back in place and rebuild the indexes.
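
The backup step could be scripted as follows. This is a sketch rather than a tool shipped with TaggedFS: the function name backup_storage and the archive root name ‘storage’ are hypothetical, and it assumes the indexes live in a directory named ‘indexes’ directly under the storage directory. It uses Python's standard tarfile module and skips the indexes, since they can be rebuilt.

```python
import tarfile


def backup_storage(storage_dir, archive_path):
    """Archive the storage directory, leaving out the rebuildable indexes."""

    def skip_indexes(info):
        # Member names look like "storage/indexes/..."; drop that subtree.
        parts = info.name.split("/")
        if len(parts) > 1 and parts[1] == "indexes":
            return None
        return info

    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(storage_dir, arcname="storage", filter=skip_indexes)
```

Run it against the storage directory (the one passed with -o dir=), not against the mounted /tags tree, which can’t be walked recursively.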


10 Responses to Tag based file system using FUSE

  1. Marco P says:

    Thank you! It was exactly what I was looking for during the last week… Consider that I was almost going to switch from gnome to kde only because of dolphin tagging features (via nepomuk)!

    Your solution has the great advantage of being portable and desktop environment independent. Last Saturday I spent the whole day developing something very similar to TaggedFS but using hard links and a daemonized bash script (I should study the FUSE libraries…): I gave up because it was very, very slow!

    Byez!

  2. Pingback: TagFS, tracking progress in the field of semantic file systems | Zen of Linux

  3. amir beygi says:

    Hi
    i can’t start using this lovely application . and after
    #taggedfs /mydir/
    i can’t access my mydir anymore and got this error message
    cd: _tag: Transport endpoint is not connected

  4. lordikc says:

    Hi,

    You are missing the -o dir=/xxx option. See the README and usage.
    The command is:
    taggedfs /mount_point -o dir=/storage_dir

    Regards,

    Lordikc

  5. amir beygi says:

    Thanks for your response.
    Now I am unable to create my test directory in tags_structure:
    root@amir-laptop:/x/ttag/m# cd
    remove_tag/ tags/ tags_structure/
    root@amir-laptop:/x/ttag/m# cd tags_structure/
    root@amir-laptop:/x/ttag/m/tags_structure# mkdir Test
    mkdir: cannot create directory `Test’: No such file or directory
    root@amir-laptop:/x/ttag/m/tags_structure#

  6. lordikc says:

    Well, I’ve just tried it myself and it works:

    [lordikc@home ~]$ mkdir /tmp/tagstore
    [lordikc@home ~]$ mkdir ~/mnt
    [lordikc@home ~]$ taggedfs ~/mnt -o dir=/tmp/tagstore
    [lordikc@home ~]$ cd ~/mnt/tags_structure/
    [lordikc@home tags_structure]$ mkdir Test
    [lordikc@home tags_structure]$ ls
    Test/
    [lordikc@home tags_structure]$ cd
    [lordikc@home ~]$ fusermount -u ~/mnt

    I guess you lack access rights somewhere.

  7. amir beygi says:

    hi
    my problem was
    taggedfs ~/mnt/ -o dir=/tmp/tagstore/
    instead of
    taggedfs ~/mnt -o dir=/tmp/tagstore

    I have some ideas on tagfs. Could you please send me an email in private?

  8. amir beygi says:

    Another question: do you know of any upgrade for Nautilus? I am using your software to manage my huge collection of movies, ebooks, pictures and documents (about 3 terabytes).

  9. amir beygi says:

    Hi again
    Are you still working on this project to improve it?
    I started a pilot system, and I think it would be nice if it had these additional features:

    1. Placing a !! before a tag directory to find files that do not have this tag. For example, when I am in a folder tagging my files, I may forget whether the family_mom.jpg file is under the /mom/ tag or not, so I could use /!!mom/ to find the files without the mom tag.

    2. When I am in a folder /a/b//d/
    and I move my file from location /a/b/ to the e/ folder, it removes the [d] tag and adds the [e] tag. NOT GOOD

    3. When you open -> edit -> save your file from a parent tag, it is removed from the child tag. NOT GOOD

    4. It would be good to handle duplicate file names.

    5. It would be good (but very hard) to store the tag information in the files' own folders, so that I would not have to change my directory structure and could even add a tag to a folder. For example, add the [picture][personal] tags to my /personal/picture/ folder, and all the files in it would receive the [picture][personal] tags automatically.

  10. lordikc says:

    Sorry,

    I no longer work actively on this project. It was a proof of concept, but you can download the source code and work on it.

    Regards,

    Lordikc