Storage

Shared Devices

If Haiku were to expand and generalise the popular concept of file and printer sharing, arbitrary devices, both physical and virtual, could be shared over any network. From the user's point of view, remote devices would appear no different from local devices.

Details

Provider

On connection of a shareable device the user's computer would first establish whether the device is allowed to be shared and, if so, who is allowed to access it. The device would then be kept in a list of shared devices which is externally queryable.
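The provider-side bookkeeping described above could be sketched roughly as follows. This is only an illustration: the names ShareRegistry, SharedDevice, on_device_connected and query are hypothetical and not part of any Haiku API, and the network transport is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class SharedDevice:
    path: str                                        # local devfs path
    allowed_users: set = field(default_factory=set)  # empty set: nobody


class ShareRegistry:
    """Hypothetical provider-side list of shared devices."""

    def __init__(self):
        self._devices = {}

    def on_device_connected(self, path, shareable, allowed_users=()):
        # Only devices explicitly marked as shareable enter the list.
        if shareable:
            self._devices[path] = SharedDevice(path, set(allowed_users))

    def query(self, user):
        # What a remote consumer would see: only devices it may access.
        return sorted(device.path for device in self._devices.values()
                      if user in device.allowed_users)


registry = ShareRegistry()
registry.on_device_connected("camera/cam0", shareable=True,
                             allowed_users=["alice"])
registry.on_device_connected("disk/ide/0/raw", shareable=False)

print(registry.query("alice"))  # ['camera/cam0']
print(registry.query("bob"))    # []
```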

Consumer

Automatic querying of remote devices should allow for sensible, user-definable parameters so that locally published devices can be discovered efficiently, without excessive resource or time usage (where "locally" is context dependent, e.g. LAN, WAN, VPN, favourite IPs, etc).

To use a remote device, it would be mounted internally on the devfs under a folder uniquely named per computer. Entries would be populated under this directory in the same fashion as for local devices, e.g. "/dev/[uniqueid]/camera/cam0". The device would thus appear no different from a local device. Data directed at a remote devfs entry would be sent over the network and handled remotely.
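As a rough illustration of the naming scheme above, the following sketch builds the local devfs entry for a remote device and recognises such entries again so that I/O could be forwarded. The helper names and the example id "host-1234" are assumptions made for this sketch only.

```python
import posixpath


def remote_devfs_path(unique_id, device_path, dev_root="/dev"):
    """Build the local devfs entry for a device on a remote computer."""
    if "/" in unique_id or unique_id in ("", ".", ".."):
        raise ValueError("unique id must be a single path component")
    return posixpath.join(dev_root, unique_id, device_path.lstrip("/"))


def split_remote_path(path, known_ids, dev_root="/dev"):
    """Return (unique_id, device_path) for a remote entry, else None."""
    relative = posixpath.relpath(path, dev_root)
    head, _, tail = relative.partition("/")
    if head in known_ids and tail:
        return head, tail
    return None  # an ordinary local device


entry = remote_devfs_path("host-1234", "camera/cam0")
print(entry)  # /dev/host-1234/camera/cam0
```

A consumer could call split_remote_path on every devfs access and forward the request over the network whenever the result is not None.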

Potential schemes that have been suggested for the unique ids are listed below. There must however exist a method to translate whatever ID scheme has been chosen into a user understandable and referenceable format.

Name  Advantages                   Disadvantages
IP    Easily obtainable, scalable  Volatile
MAC   Easily obtainable, unique    No inter-network capability
PGP   Unique, secure               Additional overhead/complexity
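Whichever ID scheme is chosen, the translation into a user-referenceable name could be as simple as a two-way alias table. The sketch below assumes a hypothetical AliasTable class and uses a MAC-style id purely as an example.

```python
class AliasTable:
    """Hypothetical two-way mapping between unique ids and friendly names."""

    def __init__(self):
        self._by_id = {}
        self._by_alias = {}

    def assign(self, unique_id, alias):
        owner = self._by_alias.get(alias)
        if owner is not None and owner != unique_id:
            raise ValueError(f"alias {alias!r} is already in use")
        # Drop any earlier alias for this id so names stay unambiguous.
        old_alias = self._by_id.get(unique_id)
        if old_alias is not None:
            del self._by_alias[old_alias]
        self._by_id[unique_id] = alias
        self._by_alias[alias] = unique_id

    def display_name(self, unique_id):
        # Fall back to the raw id when the user has not named the computer.
        return self._by_id.get(unique_id, unique_id)

    def resolve(self, alias):
        return self._by_alias[alias]


aliases = AliasTable()
aliases.assign("00:11:22:33:44:55", "Study PC")
print(aliases.display_name("00:11:22:33:44:55"))  # Study PC
print(aliases.resolve("Study PC"))                # 00:11:22:33:44:55
```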

Drivers

When a device has been acquired, the client computer would need to ensure that it has the appropriate drivers installed to use the hardware. This could be accomplished via a driver server or a generic file server system.

Issues

  • Do local buses make assumptions regarding bandwidth, latency or quality of service? If so, does that impede the system?
  • How will security be handled?
  • Is there a way to exploit the potential emergence of a P2P network out of queryable shares?
  • What method should be used for publication? Should it be a single universal method, or multiple methods for different scenarios?
  • Should client polling (good for small networks) or server notification (larger networks) be used to indicate a change of state of a device?

Layered File System

The ability to mount a filesystem's directory structure 'over the top of' existing mounted filesystems would allow for a number of simplifications for Haiku users' systems.

Details

Traditionally, mounting a filesystem on a specific directory results in the new filesystem 'overriding' the specified directory, effectively hiding the original directory's contents. This restriction could however be lifted, allowing for layered, or union, mounting of filesystems.

When mounting filesystems the system will keep an ordered list of mount points. When a file, directory or other object is accessed, the system will traverse the mount list, attempting to find a matching filesystem object in each filesystem in turn. The first match found is the one on which the system will act. Therefore, if there are two (or more) matches for a specific object, the filesystem first on the mount list will be used.

If the operation being attempted is file creation, the system should create the file on the first read-write filesystem found.
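The lookup and creation rules above can be sketched with a small in-memory model. Layer and LayeredMount are hypothetical names, and each layer is reduced to a dictionary of paths. This is not a description of how a kernel implementation would look.

```python
class Layer:
    def __init__(self, name, files, writable):
        self.name = name
        self.files = dict(files)  # path -> contents
        self.writable = writable


class LayeredMount:
    def __init__(self, layers):
        self.layers = list(layers)  # ordered mount list, first entry on top

    def lookup(self, path):
        # Traverse the mount list; the first layer with a match wins.
        for layer in self.layers:
            if path in layer.files:
                return layer
        return None

    def read(self, path):
        layer = self.lookup(path)
        if layer is None:
            raise FileNotFoundError(path)
        return layer.files[path]

    def create(self, path, contents):
        # New files go to the first read-write layer on the list.
        for layer in self.layers:
            if layer.writable:
                layer.files[path] = contents
                return layer
        raise PermissionError("no read-write layer mounted")


user = Layer("user", {"fonts/Custom.ttf": "user font"}, writable=True)
system = Layer("system", {"fonts/Custom.ttf": "system font",
                          "fonts/Default.ttf": "default font"}, writable=False)
fonts = LayeredMount([user, system])

print(fonts.read("fonts/Custom.ttf"))   # user font
print(fonts.read("fonts/Default.ttf"))  # default font
print(fonts.create("fonts/New.ttf", "new font").name)  # user
```

One thing the sketch leaves open is what should happen when the first read-write layer sits below a read-only layer that already contains the same path, since the new file would then be hidden.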

Usage

For Haiku users the system would keep a user-specific set of directories with which it could overlay the system directories. For example, a user may wish to manage their own font directory. This directory could be mounted during login over the system font directory, thus removing the need to manage separate search paths.

The system would also allow much freedom in areas traditionally protected by permission constraints. For example, installing new software for a specific user is generally difficult without resorting to increased privileges. With layered file systems, however, a user would have a (possibly empty) application directory mounted over the system application directory, again doing away with multiple search paths and keeping the system much more cohesive.

Issues

  • What are the specifics of handling individual layers and filesystem operations?
  • Does this system obfuscate the system internals where they are working adequately already?

File User Polymorphism

File systems could benefit by adding functionality which would allow files to behave differently based upon which user is currently reading them. Objects could have different contents, permissions and attributes.

Details

A file/user polymorphic file system would give file system objects the ability to behave differently, or effectively be different, for each user. It would allow all file information to be selected dynamically, based entirely upon the current user. In more general terms, a polymorphic file could appear completely different, in terms of data content and meta-data, when viewed by different users. The only detail necessarily the same would be the file path.

A short, and incomplete, list of file information that could possibly be managed via this process is:

  • Contents, i.e. the actual file data
  • Attributes
  • Permissions

Foreseen difficulties lie mainly in the area of administration. It is difficult to determine how an administrator would be able to modify a given user's files if they were polymorphic. This could negatively affect operations such as backups. Also, depending on the specific implementation, it has been suggested that there could be difficulties in allowing multiple users access to the file system.

A possible implementation that has been suggested is to implement polymorphic symlinks whose target changes depending on the current user. This would fit nicely with existing file system methodologies and hence be one of the most transparent implementations.
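A polymorphic symlink of this kind could be modelled as follows. The class name, the resolve_path helper and the per-user paths under "/boot/users" are all hypothetical and chosen only to make the idea concrete.

```python
class PolymorphicSymlink:
    """Hypothetical symlink whose target depends on the current user."""

    def __init__(self, targets, default=None):
        self.targets = dict(targets)  # user name -> target path
        self.default = default

    def resolve(self, user):
        if user in self.targets:
            return self.targets[user]
        if self.default is None:
            raise FileNotFoundError(f"no target for user {user!r}")
        return self.default


def resolve_path(path, links, user):
    # Replace a leading polymorphic link with the target for this user.
    for prefix, link in links.items():
        if path == prefix or path.startswith(prefix + "/"):
            return link.resolve(user) + path[len(prefix):]
    return path  # not under any polymorphic link


links = {"/boot/home": PolymorphicSymlink({"alice": "/boot/users/alice",
                                           "bob": "/boot/users/bob"})}

print(resolve_path("/boot/home/config/settings", links, "alice"))
# /boot/users/alice/config/settings
print(resolve_path("/boot/home/config/settings", links, "bob"))
# /boot/users/bob/config/settings
```

Every user opens the same path, which is the point of the proposal. Only the resolution step knows who is asking.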

The benefits of polymorphic file systems lie primarily in the reduction of complexity from the user's and the application's perspective. Neither needs to take into account whether the system is multi-user or which user is logged in. There is a single file hierarchy for all cases which, without application or user intervention, changes to suit the scenario. However, care must be taken not to hard-code too many paths. Functions such as find_directory (located in the storage kit) should still be used as much as possible to find specific directories for application usage.

Direct uses are many and varied. The most popular is having the user's home directory managed via a polymorphic entry. This would also simplify the implementation of quick user switching similar to that in Windows XP and Mac OS X Panther. Application settings files could also be stored with the application itself, simplifying the copying of an application together with its entire environment.

DevFS Attributes

Storing information about devices as attributes in the relevant areas of the devfs could allow much simpler (and more powerful) access to device specific data.

Details

There are many items of information regarding physical and virtual devices that currently require specialised APIs for querying. To expose them as attributes instead, an additional set of driver hooks could be added to retrieve the data. It has been suggested that this method could reduce driver complexity and improve extensibility (regarding PCI IDs).

It would be possible, if the required functionality were in place and the attributes present, to utilise a kernel-enforced naming convention for devfs publication. For example, "this is an IDE raw disk node, on the 1st bus master device" could be used to create a kernel-defined devfs path.

Applications

Many applications exist, such as:

  • Video card frame buffer address
  • Manufacturer ID and device string
  • Ethernet MAC address

In addition to these examples the use of queries would provide an extremely powerful user interface to the hardware information.
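To give a feel for what such queries might look like, the sketch below models devfs entries as paths with attributes and filters them by attribute value. The device paths, the "DEV:" attribute names and all values are made up for illustration.

```python
# Hypothetical devfs entries and their attributes.
devices = {
    "/dev/net/example/0": {
        "DEV:type": "ethernet",
        "DEV:vendor_id": 0x1234,
        "DEV:mac_address": "00:11:22:33:44:55",
    },
    "/dev/graphics/example_0": {
        "DEV:type": "video",
        "DEV:vendor_id": 0x1234,
        "DEV:framebuffer_address": 0xE0000000,
    },
    "/dev/net/other/0": {
        "DEV:type": "ethernet",
        "DEV:vendor_id": 0x5678,
        "DEV:mac_address": "66:77:88:99:AA:BB",
    },
}


def query(devices, conditions):
    """Return the paths of all devices whose attributes match exactly."""
    return sorted(path for path, attributes in devices.items()
                  if all(attributes.get(name) == value
                         for name, value in conditions.items()))


print(query(devices, {"DEV:type": "ethernet"}))
# ['/dev/net/example/0', '/dev/net/other/0']
print(query(devices, {"DEV:type": "ethernet", "DEV:vendor_id": 0x1234}))
# ['/dev/net/example/0']
```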

Cross Referenced Files

This proposal suggests an attribute based system to label files as referencing other files in order to allow queries to utilise inter-file relationships.

Details

Currently files whose contents reference other files on the local system can only be found through interrogating the file's data or through other similar mechanisms requiring some insight into the file data. It is proposed that a system wide set of attributes be used in order to build relationships.

Each file which is referred to is given a unique ID which remains constant across all filesystem operations (specifically moving) in order to preserve relationships. To create the relationship a file will have a special 'ParentID' attribute set which contains this key. In order to find all file relationships a query can be constructed which matches these keys.

This system allows for more general tree structures of references. An application can thus use queries to work its way up the tree, finding the required relationships as needed. The proposal would benefit most from modifications to Tracker that allow visualisation of this tree structure. The primary benefit is more logical data organisation without resorting to specialised applications. A great example of this at work would be email 'conversation trees'.
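The email conversation tree could be modelled as below. 'ParentID' is the attribute named in this proposal, while 'FileID', the file paths and the helper functions are assumptions made for the sketch. A real implementation would run filesystem queries rather than scan a dictionary.

```python
# Hypothetical files with their relationship attributes.
files = {
    "mail/hello":       {"FileID": "m1", "ParentID": None},
    "mail/re-hello":    {"FileID": "m2", "ParentID": "m1"},
    "mail/re-re-hello": {"FileID": "m3", "ParentID": "m2"},
    "mail/re-hello-2":  {"FileID": "m4", "ParentID": "m1"},
}


def children(files, file_id):
    """Query: all files whose ParentID matches the given key."""
    return sorted(path for path, attributes in files.items()
                  if attributes.get("ParentID") == file_id)


def find_by_id(files, file_id):
    for path, attributes in files.items():
        if attributes.get("FileID") == file_id:
            return path
    return None


def ancestors(files, path):
    """Work up the tree from a file to the root of its conversation."""
    chain, seen = [], {path}
    parent_id = files[path].get("ParentID")
    while parent_id is not None:
        parent = find_by_id(files, parent_id)
        if parent is None or parent in seen:  # dangling or looping reference
            break
        chain.append(parent)
        seen.add(parent)
        parent_id = files[parent].get("ParentID")
    return chain


print(children(files, "m1"))  # ['mail/re-hello', 'mail/re-hello-2']
print(ancestors(files, "mail/re-re-hello"))  # ['mail/re-hello', 'mail/hello']
```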

Issues

  • Whether to use distinct key types or a single key type was never decided, i.e. whether to use recipient and sender IDs or just generic IDs.
  • How are keys generated?

Calculated Attributes

Extending the OpenBFS/BFS via the usage of calculated attributes for arbitrary files and file types would provide a useful means of dynamic and constantly up to date meta-data on many types of files.

Details

On top of the current methods to define and maintain file attributes under OpenBFS/BFS, there could exist methods to define calculated attributes for arbitrary files and file types. The calculation may be based on other attributes of the same file, globally defined values or value lists and other special information like item count for queries or folders.

Importantly, when a value which a calculated attribute uses changes there must be a method to recursively trace all dependent calculated attributes and re-calculate any affected formulae in order to keep all attributes up to date. In the case of attributes which have been indexed, the appropriate indices must also be updated at the same time.

Such calculated attributes may be defined on a per-file basis, in the form of a specially flagged attribute whose flag indicates that its formula string has to be evaluated rather than used directly as the value, or in the MIME DB for various file types.

In the latter case all files of the corresponding type would "inherit" them. This is possible because a file does not have to provide a value for a calculated field itself, whereas normal attributes have to be given a specific value. Calculations for which some source attribute is missing would take a concrete default value, or maybe an empty value (NULL).

An implementation of this system would need to be aware of problems relating to recursive calculated attributes and enforce restrictions such that all calculations are acyclic. Many modern DBMS with similar functionality have overcome this problem.
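The recalculation and acyclicity requirements can be illustrated with a small in-memory model. The class CalculatedAttributes and the attribute names are hypothetical. Formulae are plain Python functions here rather than formula strings, and index updates are omitted.

```python
class CalculatedAttributes:
    """Attributes of one file; some are stored, some are calculated."""

    def __init__(self):
        self.values = {}    # current value of every attribute
        self.formulas = {}  # calculated name -> (source names, function)

    def _depends_on(self, name, target, seen=None):
        # True if `name` depends, directly or indirectly, on `target`.
        seen = set() if seen is None else seen
        sources, _ = self.formulas.get(name, ((), None))
        for source in sources:
            if source == target:
                return True
            if source not in seen:
                seen.add(source)
                if self._depends_on(source, target, seen):
                    return True
        return False

    def define(self, name, sources, func):
        previous = self.formulas.get(name)
        self.formulas[name] = (tuple(sources), func)
        if self._depends_on(name, name):
            # Reject the cyclic definition and restore the earlier state.
            if previous is None:
                del self.formulas[name]
            else:
                self.formulas[name] = previous
            raise ValueError(f"{name!r} would depend on itself")
        self._recalculate(name)
        self._propagate(name)

    def set(self, name, value):
        if name in self.formulas:
            raise ValueError(f"{name!r} is calculated and cannot be set")
        self.values[name] = value
        self._propagate(name)

    def _recalculate(self, name):
        sources, func = self.formulas[name]
        args = [self.values.get(source) for source in sources]
        # A missing source yields an empty value (None stands in for NULL).
        self.values[name] = None if None in args else func(*args)

    def _propagate(self, changed):
        # Recursively re-calculate everything that uses the changed value.
        for name, (sources, _) in self.formulas.items():
            if changed in sources:
                self._recalculate(name)
                self._propagate(name)


attrs = CalculatedAttributes()
attrs.define("area", ["width", "height"], lambda w, h: w * h)
attrs.define("is_large", ["area"], lambda area: area > 1000)
attrs.set("width", 10)
attrs.set("height", 20)
print(attrs.values["area"], attrs.values["is_large"])  # 200 False
attrs.set("width", 100)
print(attrs.values["area"], attrs.values["is_large"])  # 2000 True
```

Because cyclic definitions are rejected up front, the recursive propagation is guaranteed to terminate. In this simple form it may recalculate an attribute more than once when several of its sources change together.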

Potential uses for such a system are numerous. Backward-compatible attribute names could become useful, e.g. an attribute called "Subject" which takes its value from MAIL:subject, NEWS:subject or the file name, depending on whichever exists. Another obvious usage is in digital media, where many file types have independent mechanisms for meta-data storage. An abstracted method to gather information from MP3, OGG, WMA, etc would prove invaluable for media developers and users alike.
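The "Subject" example amounts to taking the first source that exists. A minimal sketch, assuming the order MAIL:subject, then NEWS:subject, then the file name, and with the helper names invented for illustration:

```python
def first_present(attributes, names, default=None):
    """Return the value of the first attribute in `names` that exists."""
    for name in names:
        if name in attributes:
            return attributes[name]
    return default


def subject(file_name, attributes):
    # Calculated "Subject": mail subject, else news subject, else file name.
    return first_present(attributes, ["MAIL:subject", "NEWS:subject"],
                         default=file_name)


print(subject("msg-001", {"MAIL:subject": "Meeting notes"}))  # Meeting notes
print(subject("post-17", {"NEWS:subject": "Release news"}))   # Release news
print(subject("notes.txt", {}))                               # notes.txt
```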

Fildirute

Fildirutes are an idea for file system naming and simplification. Rather than having separate APIs for files, directories and attributes (and indices too!), you have a single fildirute kind of thing. You can read data from it like a file. You can also open it like a directory and see what's inside. The things inside are fildirutes too. A small one inside would be like an attribute; its name would tell you what the intended purpose was.

Example

If you had a document called "MyDoc", then reading data from it would give you the document text. Looking inside it as if it were a directory, you'd see "MyDoc/mime-type", which if you opened it and read the data, would give you "text/plain". There could also be things like "MyDoc/thumbnail" which would have "MyDoc/thumbnail/mime-type" containing "image/png", and "MyDoc/thumbnail/width" containing 32 in binary, and for completeness, "MyDoc/thumbnail/width/mime-type" would contain "number/int32".

Eg,

MyDoc
MyDoc/mime-type
MyDoc/thumbnail
MyDoc/thumbnail/mime-type
MyDoc/thumbnail/width
MyDoc/thumbnail/width/mime-type

Of course, under the hood the small fildirutes are stored much like current attributes are. Some of them are also dynamically generated, and don't really exist, like the "mime-type" ones. It's just that the API is simplified so that everything is accessible as a fildirute. There is only one "open" function in the API, and just a plain "read" for data, and a "readdir" to find contained things. And "close". And that's it.
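The four-call API could look something like the sketch below, which models the "MyDoc" example in memory. The function and class names are invented for illustration, the "mime-type" entries are generated on the fly as described above, and the little-endian encoding of the width and the placeholder contents are assumptions.

```python
import struct


class Fildirute:
    def __init__(self, data=b"", mime_type="application/octet-stream"):
        self.data = data
        self.mime_type = mime_type
        self.children = {}  # name -> Fildirute


class Handle:
    def __init__(self, node):
        self.node = node
        self.closed = False


def open_fildirute(root, path):
    node = root
    for name in path.split("/"):
        if name in node.children:
            node = node.children[name]
        elif name == "mime-type":
            # Generated dynamically; it is not stored anywhere.
            node = Fildirute(node.mime_type.encode("ascii"), "text/plain")
        else:
            raise FileNotFoundError(path)
    return Handle(node)


def _check_open(handle):
    if handle.closed:
        raise ValueError("handle is closed")


def read(handle):
    _check_open(handle)
    return handle.node.data


def readdir(handle):
    _check_open(handle)
    return sorted(set(handle.node.children) | {"mime-type"})


def close(handle):
    handle.closed = True


# Build the example tree: MyDoc, its thumbnail, and the thumbnail's width.
root = Fildirute()
doc = root.children["MyDoc"] = Fildirute(b"document text", "text/plain")
thumb = doc.children["thumbnail"] = Fildirute(b"<png bytes>", "image/png")
thumb.children["width"] = Fildirute(struct.pack("<i", 32), "number/int32")

handle = open_fildirute(root, "MyDoc")
print(readdir(handle))  # ['mime-type', 'thumbnail']
close(handle)

handle = open_fildirute(root, "MyDoc/thumbnail/width/mime-type")
print(read(handle))     # b'number/int32'
close(handle)
```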
