Difference between revisions of "ARK2/Files"

(4 intermediate revisions by the same user not shown)
== Design Work ==

File management needs to be flexible and fast.

* File attachments to data items
* Data downloads / exports
* Documentation files
* Temp files with expiry
* Mapping files
* Image management, generated thumbnails, etc.
* Metadata, mimetypes, etc.
* Cloud storage
* Efficiency
* Security

Problems with current setup:

* All files in one directory, which can be slow for large volumes
* Thumbnails in the same directory, which slows performance and is harder to maintain
* No versioning
* No expiry

Full document management and versioning workflow is a stretch goal needed for Avalon. Try to use the CMIS standard as used in LibreOffice.

* https://www.alfresco.com/cmis
* https://packagist.org/packages/dkd/php-cmis

Need to plan for moving to an advanced model while keeping it simple for the first release.

* Files are a system module, with modtype for core file type and attributes as required
* Required module properties for title and description, all else optional, some core fields replicated on the item table for performance
* Use [http://flysystem.thephpleague.com/ FlySystem] for file system abstraction, extended to allow subpaths to be transparently mounted
* Use [https://github.com/avalanche123/Imagine Imagine] for image handling
* Split files into subdirs with a maximum number of files per dir
* Thumbnails in a separate dir, managed by image code (on-the-fly creation, etc.)
* Support versioning
* Support expiry date for tmp files
* Support mimetypes + core types (image, video, audio, document, text)

Very similar to http://documentation.concrete5.org/developers/working-with-files-and-the-file-manager/overview

Latest revision as of 15:51, 29 April 2017

File Management

File and media management has changed substantially in ARK2, with a more flexible system implemented.

  • Files are now a core Module instead of a dataclass and look-up table, allowing for flexible metadata based on file type, and using files as either data properties or relationships.
  • The file system is abstracted using Flysystem, allowing files to be transparently saved locally or to cloud providers.
  • File versioning is supported.
  • Image manipulation is abstracted using Glide and Imagine, which support ImageMagick or GD.
  • Thumbnails are stored separately from the master files to make backups, archiving, and regeneration easier.
  • Multiple thumbnail profiles can be configured.
  • Thumbnails will be generated asynchronously, so they can be queued for batch processing, delayed until a suitable time window, or generated only when needed.
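
The three thumbnail strategies listed above (queued batch processing, delayed processing, and generation on demand) can be sketched roughly as follows. This is an illustrative Python sketch only, not ARK2 code: the class name ThumbnailQueue, its methods, and the render callable are all hypothetical, and an in-memory dict stands in for the thumbnail store.

```python
from collections import deque


class ThumbnailQueue:
    """Hypothetical helper: queue thumbnails for later or build them on demand."""

    def __init__(self, render):
        # render is an assumed callable: (file_id, profile) -> thumbnail bytes
        self._render = render
        self._pending = deque()
        self._cache = {}

    def enqueue(self, file_id, profile):
        """Queue a thumbnail unless it already exists or is already queued."""
        key = (file_id, profile)
        if key not in self._cache and key not in self._pending:
            self._pending.append(key)

    def process_batch(self, limit):
        """Render up to `limit` queued thumbnails, e.g. in a quiet time window."""
        done = 0
        while self._pending and done < limit:
            key = self._pending.popleft()
            self._cache[key] = self._render(*key)
            done += 1
        return done

    def get(self, file_id, profile):
        """Return a thumbnail, rendering it immediately if it is missing."""
        key = (file_id, profile)
        if key not in self._cache:
            if key in self._pending:
                self._pending.remove(key)
            self._cache[key] = self._render(*key)
        return self._cache[key]
```

A scheduled job would call process_batch() during a suitable window, while a request handler would call get() so that a missing thumbnail is still served.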

The Media Types and File Types supported are configurable, with the IANA standard types provided by default:

  • Image (jpg, png, etc)
  • Audio (mp3, etc)
  • Video (mkv, etc)
  • Text (txt, etc)
  • Application (pdf, odt, ods, etc)
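
As a rough illustration of how a file name might be mapped to one of these top-level media types, the sketch below uses Python's standard mimetypes module. The function name media_type_for and the fallback to "application" for unknown extensions are assumptions made for the example; ARK2's own configuration may behave differently.

```python
import mimetypes


def media_type_for(filename: str) -> str:
    """Return the top-level IANA media type guessed from a file name."""
    guessed, _encoding = mimetypes.guess_type(filename)
    if guessed is None:
        # Assumption: treat unknown extensions as generic application data.
        return "application"
    # "image/jpeg" -> "image"
    return guessed.split("/", 1)[0]


print(media_type_for("photo.jpg"))   # image
print(media_type_for("report.pdf"))  # application
```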

In multi-tenant mode, the files for each ARK instance are stored separately under the 'sites' folder, e.g. 'sites/mysite/files'.

The directory structure for each instance is as follows:

- files
  |- download
     |- <user>
  |- upload
     |- <user>
  |- tmp
     |- <user>
  |- data
     |- <mediatype>
        |- <token>
  |- cache
     |- <profile>
        |- <mediatype>
           |- <token>

where the <mediatype> is as defined in the configuration, and the <token> is a configurable token that splits the folders into manageable sizes. By default the <token> is derived from the file id in groups of 1000, giving subfolders 0, 1000, 2000, 3000, etc.
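
A minimal sketch of the default token calculation might look like the following. The function name token_for is hypothetical, and the sketch assumes that ids 0 to 999 fall into subfolder 0, ids 1000 to 1999 into subfolder 1000, and so on; the exact boundary handling in ARK2 is not stated here.

```python
def token_for(file_id: int, group_size: int = 1000) -> str:
    """Return the subfolder token for a file id, e.g. 2345 -> "2000"."""
    if file_id < 0:
        raise ValueError("file id must be non-negative")
    # Round the id down to the nearest multiple of the group size.
    return str((file_id // group_size) * group_size)


print(token_for(999))   # 0
print(token_for(2345))  # 2000
```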

The individual file names will be in a standard format. Flysystem abstracts folders and file names as a relative path, which will take the form:

files/data/<mediatype>/<token>/<id>.<revision>.<suffix>
files/cache/<profile>/<mediatype>/<token>/<id>.<revision>.<suffix>
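
The two path forms above could be assembled along these lines. This is a sketch under assumptions: the function names are hypothetical, the revision is assumed to be a simple integer, and the token is assumed to be the file id rounded down to a multiple of 1000 as in the default described earlier.

```python
def token_for(file_id: int, group_size: int = 1000) -> str:
    """Assumed default token: the file id rounded down to a multiple of group_size."""
    return str((file_id // group_size) * group_size)


def data_path(mediatype: str, file_id: int, revision: int, suffix: str) -> str:
    """Relative path of a master file."""
    token = token_for(file_id)
    return f"files/data/{mediatype}/{token}/{file_id}.{revision}.{suffix}"


def cache_path(profile: str, mediatype: str, file_id: int,
               revision: int, suffix: str) -> str:
    """Relative path of a derived file (e.g. a thumbnail) for a given profile."""
    token = token_for(file_id)
    return f"files/cache/{profile}/{mediatype}/{token}/{file_id}.{revision}.{suffix}"


print(data_path("image", 2345, 1, "jpg"))
# files/data/image/2000/2345.1.jpg
print(cache_path("thumb", "image", 2345, 1, "jpg"))
# files/cache/thumb/image/2000/2345.1.jpg
```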

There will be no direct file or image links; all files are considered private, and requests to the mapped URL will be security checked. The mapped URL forms will be:

files/<id>/<name>
files/<id>/<basename>_<profile>.<suffix>
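
One way to tell the two URL forms apart is sketched below. The names FileRequest and parse_mapped_url are hypothetical, the file id is assumed to be numeric, and the set of configured profile names is passed in so that an ordinary file name containing an underscore is not mistaken for a profile request. The security check itself is outside the scope of the sketch.

```python
from typing import NamedTuple, Optional, Set


class FileRequest(NamedTuple):
    file_id: int
    name: str                # file name without any profile part
    profile: Optional[str]   # None when the master file is requested


def parse_mapped_url(path: str, profiles: Set[str]) -> FileRequest:
    """Split files/<id>/<name> or files/<id>/<basename>_<profile>.<suffix>."""
    parts = path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "files" or not parts[1].isdecimal():
        raise ValueError(f"not a mapped file URL: {path!r}")
    file_id = int(parts[1])
    filename = parts[2]
    stem, dot, suffix = filename.rpartition(".")
    if dot:
        basename, underscore, profile = stem.rpartition("_")
        # Only treat the trailing part as a profile if it is a configured one.
        if underscore and basename and profile in profiles:
            return FileRequest(file_id, f"{basename}.{suffix}", profile)
    return FileRequest(file_id, filename, None)


print(parse_mapped_url("files/42/plan_thumb.jpg", {"thumb"}))
# FileRequest(file_id=42, name='plan.jpg', profile='thumb')
```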

A custom Flysystem manager is provided that allows different filesystems to be transparently mounted at different points in the file path. For example, the root files/ path can be mounted locally while files/data/ is mounted on Amazon S3.
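
The idea of mounting filesystems at different points in the path can be illustrated with a longest-prefix lookup. The sketch below is plain Python written for illustration; it is not Flysystem's API or the ARK2 manager, and the MountManager class here simply maps path prefixes to filesystem names.

```python
class MountManager:
    """Hypothetical sketch: resolve a path to the most specific mount point."""

    def __init__(self):
        self._mounts = {}

    def mount(self, prefix: str, filesystem: str) -> None:
        # Store prefixes with a trailing slash so "files/data" does not
        # accidentally match "files/database".
        self._mounts[prefix.strip("/") + "/"] = filesystem

    def resolve(self, path: str):
        """Return (filesystem, path relative to the mount point)."""
        path = path.lstrip("/")
        matches = [p for p in self._mounts if path.startswith(p)]
        if not matches:
            raise LookupError(f"no filesystem mounted for {path!r}")
        prefix = max(matches, key=len)  # the longest prefix wins
        return self._mounts[prefix], path[len(prefix):]


manager = MountManager()
manager.mount("files", "local")
manager.mount("files/data", "s3")
print(manager.resolve("files/data/image/0/1.1.jpg"))  # ('s3', 'image/0/1.1.jpg')
print(manager.resolve("files/tmp/alice/x.txt"))       # ('local', 'tmp/alice/x.txt')
```

Choosing the longest matching prefix means a more specific mount such as files/data/ overrides the root files/ mount without the caller needing to know which backend is in use.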

Metadata

Full metadata is supported in line with Dublin Core.
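
For reference, the Dublin Core Metadata Element Set defines fifteen elements. The sketch below lists them and checks a metadata record against that set; the function name unknown_elements and the idea of storing metadata as a flat dict are assumptions for illustration, not a description of how ARK2 stores metadata.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


def unknown_elements(metadata: dict) -> set:
    """Return the keys of a metadata record that are not Dublin Core elements."""
    return set(metadata) - DUBLIN_CORE_ELEMENTS


record = {"title": "Trench 3 plan", "format": "image/jpeg", "colour": "red"}
print(unknown_elements(record))  # {'colour'}
```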

Versions

Design Work

  • Data downloads / exports
  • Documentation files
  • Temp files with expiry
  • Mapping files

Very similar to http://documentation.concrete5.org/developers/working-with-files-and-the-file-manager/overview