ARK2
Revision as of 18:14, 28 May 2016
This page details the progress on development of ARK 2.0
Aims
The primary aims of ARK2 are:
- Separate the ARK Database backend from the ARK Web frontend
- Implement a modern RESTful API to allow other frontends and apps to access and update the ARK Database
- Simplify the setup and configuration of ARK by moving the config into the database and providing online config tools
- Improve the overall performance and data integrity of ARK
Features
Front Controller model
- URL paths independent from source code paths for greater security and flexibility
- Most pages generated using common page layout code from config and data stored in database
- Page roles allow for switching of generated page based on user role, module, etc
- Local custom pages separated from core source and configurable by page role
User Authentication
- Internal Authentication via password
- External Authentication via OAuth2 providers (Facebook, Google, etc)
User Authorisation
- Role Based Access Control (RBAC) using hierarchical Roles and Permissions structure
RESTful API
- Modern RESTful API to access and update all ARK data
HTML5
TWIG templates and Bootstrap front-end
Design
High level design decisions for ARK2.
Technical Standards
ARK will only actively support platforms that are actively supported by their maintainers. ARK may work on earlier versions but this is not guaranteed.
- HTML5 will be used
- PHP: A minimum of v5.6 will be supported (5.6 is the final 5.x release; see http://php.net/supported-versions.php for the support status of each branch), and v7 will be supported.
- MySQL/MariaDB v5.5 (lowest supported MySQL)
- PostgreSQL and SQLite will be provided for via a database abstraction layer, but will not initially be officially supported
- mod_rewrite will be required
- All files will be UTF-8 encoded with UNIX LF line endings
Development Standards
The PHP-FIG standards will be used:
- PSR-1 and PSR-2 Coding Standards
- PSR-3 Logging Interface for interchangeable logging objects
- PSR-4 Auto-Loading Standard
- PSR-7 HTTP Message Interface for interchangeable Request/Response objects
PSR-3 and PSR-7 allow mixing and matching of component libraries from different vendors, and support future-proofing by allowing switching between libraries with minimal code changes.
PSR-4 will be used for packaging, namespace and auto-loading of OO code. A good series of articles explaining PSR-4 and modern development and packaging in general can be found at the following:
- http://culttt.com/2013/01/07/what-is-php-composer/
- http://culttt.com/2014/03/12/build-php-package/
- http://culttt.com/2014/05/07/create-psr-4-php-package/
In consequence:
- Composer will be required for dependency management and PSR-4 auto-loading
- All new external libraries will be installed by Composer under vendor/ and not libs/
- All new OO classes will be namespaced under LPArchaeology\ARK\
- All new OO code will be under src/ and not php/ (this will also clearly separate new code from old)
Components will be carefully chosen to be well supported, stable, and interchangeable wherever possible.
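As a rough illustration of what the PSR-4 convention above implies, the sketch below maps a class under the LPArchaeology\ARK\ prefix to a file under src/. The function name arkClassToPath and the example class are hypothetical; in practice Composer generates the equivalent autoloader from composer.json, so this is only meant to show the mapping rule.

```php
<?php
// Hypothetical helper: resolve a class name to a file using the PSR-4 rule.
// The prefix and base directory follow the conventions listed above; the
// function name and the example class are assumptions for illustration.
function arkClassToPath($class, $prefix = 'LPArchaeology\\ARK\\', $baseDir = 'src/')
{
    // Only classes inside the ARK namespace prefix are handled here.
    if (strncmp($class, $prefix, strlen($prefix)) !== 0) {
        return null;
    }
    // Strip the prefix, then turn namespace separators into directory separators.
    $relative = substr($class, strlen($prefix));
    return $baseDir . str_replace('\\', '/', $relative) . '.php';
}

// Registering the rule by hand; Composer's generated autoloader does the
// same job when the prefix is declared in composer.json.
spl_autoload_register(function ($class) {
    $file = arkClassToPath($class);
    if ($file !== null && is_file($file)) {
        require $file;
    }
});

echo arkClassToPath('LPArchaeology\\ARK\\Api\\ItemController'), "\n";
// prints "src/Api/ItemController.php"
```

Classes outside the prefix return null, which leaves them to any other registered autoloader, such as the one for vendor/.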
Database Abstraction
Currently, PDO is used to directly access only MySQL databases, and DB access statements are widely spread through the code base and manually assembled. Adding support for other databases such as Postgres or SQLite would require considerable work (while PDO abstracts the connection, it doesn't abstract the SQL dialect). Manual assembly also makes migration to proper transaction support and performance improvements difficult, and is a security risk because a programmer error can leave a statement open to SQL injection. A Database Abstraction Layer (DAL) can abstract away the differences in SQL between database systems, and also provide Query Builders, Schema Management, and Migration tools to address the other issues. Most are built on PDO and can seamlessly integrate with legacy code to make for an easier migration path.
Longer term, full OO code, most frameworks, and many components use an ORM to map relational data to objects. A key part of choosing a framework or component eco-system is the ORM it uses. Most ORMs however use the Active Record pattern which cannot map onto the existing ARK data model. ARK would require a Data Mapper ORM to access the legacy database structure. While using multiple ORMs would be possible, it is not recommended.
Doctrine ORM is the main PHP Data Mapper available, and is built on the Doctrine DBAL DAL. Doctrine is widely used and under active development, being the main ORM for the Symfony eco-system as well as many independent components. DBAL also provides the full set of required Drivers, Query Builder and Schema Management tools to abstract access to the required databases.
Migration process:
- Move all PDO / SQL type calls into a set of common utilities in db_functions, i.e. insert, update, etc, that take arrays of fields and values, etc
- Change all db functions to use new common utilities
- Move all db functions into db_functions
- Migrate new common utilities to DBAL
- All new table creation to use DBAL
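The first migration step, a common utility that takes arrays of fields and values, might look something like the sketch below. The function name dbBuildInsert, the table name and the field names are hypothetical and not taken from the ARK code base; the point is only that SQL is assembled in one place and values are bound as parameters rather than concatenated.

```php
<?php
// Hypothetical common utility: build a parameterised INSERT from a table
// name and an array of field => value pairs. All names are illustrative.
function dbBuildInsert($table, array $values)
{
    $fields = array_keys($values);
    // Table and field names cannot be bound as parameters, so accept only
    // plain identifiers rather than trusting the caller.
    foreach (array_merge(array($table), $fields) as $name) {
        if (!preg_match('/^[A-Za-z_][A-Za-z0-9_]*$/', (string) $name)) {
            throw new InvalidArgumentException("Invalid identifier: $name");
        }
    }
    // One named placeholder per field, e.g. :name
    $placeholders = array_map(function ($field) {
        return ':' . $field;
    }, $fields);
    $sql = sprintf(
        'INSERT INTO %s (%s) VALUES (%s)',
        $table,
        implode(', ', $fields),
        implode(', ', $placeholders)
    );
    return array($sql, $values);
}

list($sql, $params) = dbBuildInsert('example_tbl', array('name' => 'Ditch', 'created_by' => 1));
echo $sql, "\n";
// prints "INSERT INTO example_tbl (name, created_by) VALUES (:name, :created_by)"
```

Because callers only ever pass a table name and an array, the body of such a utility could later be swapped for a DBAL call without touching the callers, which is the purpose of the later migration steps.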
Framework
It is proposed to implement a new RESTful Request/Middleware/Response skeleton using a Front Controller model and token-based security, based on an external micro-framework and components adhering to the PSR standards and managed via Composer. This will reduce the amount of code maintained internally, update the code-base to modern web-app design principles, and provide a degree of future-proofing by allowing switching of components.
Choosing a full framework such as Symfony or Zend at this point would force refactoring all of the model and view code at the same time, but by initially building our own light-weight controller framework using PSR-compliant components we can migrate the model and view later. Once all parts are migrated, a full framework could be considered if required. A full framework would also impose a heavy overhead and steeper learning curve, albeit with less code required to be written.
Options for micro-frameworks or component eco-systems include:
- Slim with extra components
- Silex based on Symfony components
- Zend Expressive or components joined by Zend Stratigility
Frameworks or User Management skeletons considered but rejected include:
- Lumen based on Laravel components - requires an Active Record ORM
- UserCake - Very basic user management skeleton, no repo, not worth looking at
- User Apple Pie - a UserCake fork using own Nova Framework, probably support issues
- UserFrosting, a UserCake fork with RBAC user management, using Slim2, SBAdmin2, use for ideas
Read the following for further info:
- http://symfony.com/doc/current/book/http_fundamentals.html
- http://symfony.com/doc/current/book/from_flat_php_to_symfony2.html
- http://symfony.com/doc/current/create_framework/index.html
The ARK root folder will contain only the index.php file which will act as a dispatcher, receiving all Requests, matching the Route and dispatching them to the correct Controller. Each ARK page type and the api will have a Controller to read the model and construct the view before returning the Response. This will allow future flexibility for new request formats while still being able to support persistent legacy links. It will also allow for database config and user auth driven routing, e.g. one install may only expose the RESTful API, while others may only expose read-only pages.
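The dispatcher role described above can be sketched in a few lines of plain PHP. This is only an illustration of the Front Controller idea, not the proposed implementation: the function name arkDispatch, the route patterns, the legacy URL and the array-shaped response are all assumptions, and a chosen micro-framework would supply its own router and PSR-7 Request/Response objects.

```php
<?php
// Minimal front-controller sketch. Routes, controller closures and the
// response array shape are assumptions made for illustration.
function arkDispatch($method, $path, array $routes)
{
    foreach ($routes as $route) {
        list($verb, $pattern, $controller) = $route;
        if ($verb === $method && preg_match($pattern, $path, $matches)) {
            // Pass the captured path segments to the controller.
            return call_user_func_array($controller, array_slice($matches, 1));
        }
    }
    return array('status' => 404, 'body' => 'Not Found');
}

$routes = array(
    array('GET', '#^/api/v2/([a-z]+)/([A-Za-z0-9_]+)$#', function ($module, $item) {
        $body = json_encode(array('module' => $module, 'item' => $item));
        return array('status' => 200, 'body' => $body);
    }),
    // A persistent legacy link (hypothetical URL) kept alive as just another route.
    array('GET', '#^/legacy/page\.php$#', function () {
        return array('status' => 200, 'body' => 'legacy page');
    }),
);

// In index.php the method and path would come from $_SERVER['REQUEST_METHOD']
// and parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH).
$response = arkDispatch('GET', '/api/v2/cxt/MNO12_1000', $routes);
echo $response['status'], ' ', $response['body'], "\n";
// prints: 200 {"module":"cxt","item":"MNO12_1000"}
```

Since the route table is just data, it could be assembled from database config and the user's role, which is how one install could expose only the API while another exposes only read-only pages.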
Besides the core HTTP and Routing modules, the Security, Translations, Forms, Validation, and Console components should be considered.
Significant and reliable sources of components include:
- Packagist, the Composer index
- Symfony
- PHP League
- Zend
- Sylius, components from an e-commerce platform based on Symfony
- Aura
Security
ARK currently uses PEAR LiveUser for user authentication and authorisation, but this hasn't been updated since 2010. It is a security risk, and also lacks many features like federated login. The ARK API currently uses plain text user and password in the request URL which is insecure. ARK2 will require a new security solution, especially for the API calls from client apps.
Requirements
- User Authentication
- Token-based
- Local user database for stand-alone/internal use
- Via OAuth and OpenID authentication services (Google, Facebook, etc)
- User Authorisation
- Role-Based Access Control (RBAC) model based on Users/Roles/Permissions
- API authentication via token and secure login
- HTTPS will be required
- Use LetsEncrypt to obtain SSL certificates
- Anonymous/Unauthenticated User access as optional Role for both Web and API
- A migration path from LiveUser must be provided.
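To make the Users/Roles/Permissions requirement more concrete, here is a minimal sketch of a hierarchical permission check. The role names, permission strings and the function arkRoleCan are hypothetical; a real solution would come from whichever RBAC component is chosen and would load roles from the database.

```php
<?php
// Hypothetical role hierarchy: each role lists its own permissions and an
// optional parent role whose permissions it inherits.
$roles = array(
    'anonymous' => array('parent' => null, 'permissions' => array('item.read')),
    'user'      => array('parent' => 'anonymous', 'permissions' => array('item.create', 'item.update')),
    'admin'     => array('parent' => 'user', 'permissions' => array('item.delete', 'user.manage')),
);

function arkRoleCan(array $roles, $role, $permission)
{
    $seen = array();
    // Walk up the hierarchy until the permission is found or roles run out.
    while ($role !== null && isset($roles[$role]) && !isset($seen[$role])) {
        $seen[$role] = true; // guards against accidental cycles in the config
        if (in_array($permission, $roles[$role]['permissions'], true)) {
            return true;
        }
        $role = $roles[$role]['parent'];
    }
    return false;
}

var_dump(arkRoleCan($roles, 'user', 'item.read'));   // bool(true), inherited from anonymous
var_dump(arkRoleCan($roles, 'user', 'item.delete')); // bool(false), only admin has this
```

Modelling anonymous access as an ordinary role at the bottom of the hierarchy means the same check can serve both Web and API requests, and disabling anonymous access is just a matter of removing that role's permissions.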
Any solution chosen will work best when integrated with the other framework components chosen and should be implemented in parallel as it is highly dependent on the Request/Response/Routing/Session components used.
The Symfony Framework provides a very powerful Security component, but not a simple all-in-one solution meeting our requirements. Combining a number of external components may be able to meet our requirements, at the cost of more custom code required.
- Use Symfony\Security\Guard to manage the Authentication process
- Use League\OAuth2-Client or Opauth or HWIOAuthBundle for external OAuth2 authentication
- Use League\OAuth2-Server or FOSOAuthServerBundle for OAuth2 server for API
- Use Sylius\RBAC or FOSUserBundle for User/Role management
The combination of HWIOAuthBundle / FOSOAuthServerBundle / FOSUserBundle is widely supported and more 'native' to Symfony, but requires the use of the full framework, bundles, Doctrine ORM, and YAML-based config. The alternatives are built as stand-alone interoperable PSR components and will provide greater future flexibility and a gentler migration path, but will require more work to integrate.
Alternatives such as Sentinel, which provides all the required features in a single integrated component, would require choosing a different component ecosystem, such as Laravel.
Possible packages:
- Sentinel - Full combined package, but requires Laravel ORM, extensions like OAuth are for-pay
- Zend Auth and RBAC
- Sylius User and RBAC
- PHP League Client and OAuth2 Server
RESTful API
A RESTful API will be implemented using best practices which are outlined in the following article:
In particular, the following rules will be applied:
- JSON will be the only format supported
- All JSON will be defined using the JSON Schema standard, which can be requested using the API and used to parse/format the JSON
- API versioning will be used to version the resource path structure, error messaging, and other API infrastructure. The actual data formats will be controlled by the JSON schema.
- Authenticated access will only be available using HTTPS, API tokens, and OAuth2
- Read-only unauthenticated unencrypted access will be supported only if explicitly enabled
- Open question: use the module name or the module code in paths, e.g. contexts vs cxt?
Two options for paths:
- api/<version>/<module>/<site>/<item> - embeds site into api, makes some browsing/queries easier but what about non-site ARKs?
- api/<version>/<module>/<item> - makes site just an attribute, keeps item completely agnostic, but makes browsing/search harder?
- Maybe both? Use cxt/MNO12_1000 for pure ARK api, contexts/MNO12/1000 for more semantic version?
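If both path styles were supported, they could be normalised to the same item early in routing. The sketch below assumes, as the MNO12_1000 example suggests, that an item code is the site code and item number joined by an underscore; the function name arkParseItemPath and the returned array keys are hypothetical.

```php
<?php
// Hypothetical parser accepting both proposed path styles:
//   api/<version>/<module>/<item>          e.g. api/v2/cxt/MNO12_1000
//   api/<version>/<module>/<site>/<item>   e.g. api/v2/contexts/MNO12/1000
function arkParseItemPath($path)
{
    $parts = explode('/', trim($path, '/'));
    if (count($parts) < 4 || count($parts) > 5 || $parts[0] !== 'api') {
        return null;
    }
    $version = $parts[1];
    $module = $parts[2];
    if (count($parts) === 4) {
        // Site-agnostic form: the site is only recoverable from the item code.
        $item = $parts[3];
        $pos = strpos($item, '_');
        $site = ($pos === false) ? null : substr($item, 0, $pos);
    } else {
        // Site-in-path form: rebuild the full item code from site and number.
        $site = $parts[3];
        $item = $parts[3] . '_' . $parts[4];
    }
    return array('version' => $version, 'module' => $module, 'site' => $site, 'item' => $item);
}

print_r(arkParseItemPath('api/v2/cxt/MNO12_1000'));
print_r(arkParseItemPath('api/v2/contexts/MNO12/1000'));
// Both calls report site "MNO12" and item "MNO12_1000".
```

Mapping the module name (contexts) and the module code (cxt) onto one another is left out here, since that choice is still open.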
The following HTTP verbs will be supported:
- GET - fetch resource
- POST - insert new resource with next id, i.e. insert a new item with next item_no
- PUT - insert or update resource with a specified id, i.e. insert a new item or update an existing item with a set item_no
- PATCH - update part of a resource, i.e. update a single field or group
- DELETE - delete a resource
- OPTIONS - What HTTP verbs the current authenticated API user can perform on a resource
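The difference between POST, PUT and PATCH is easiest to see against a toy store. The sketch below uses an in-memory array keyed by item_no in place of the database; the function name arkApplyVerb and the field names are hypothetical, and OPTIONS is omitted because its answer depends on the user's role rather than on the data.

```php
<?php
// In-memory stand-in for one module's items, keyed by item_no (an assumption).
function arkApplyVerb(array &$items, $verb, $itemNo = null, array $data = array())
{
    if ($verb !== 'POST' && $itemNo === null) {
        throw new InvalidArgumentException("$verb requires an item_no");
    }
    switch ($verb) {
        case 'GET':
            return isset($items[$itemNo]) ? $items[$itemNo] : null;
        case 'POST':
            // Insert a new item with the next free item_no.
            $itemNo = empty($items) ? 1 : max(array_keys($items)) + 1;
            $items[$itemNo] = $data;
            return $itemNo;
        case 'PUT':
            // Insert, or completely replace, the item with the given item_no.
            $items[$itemNo] = $data;
            return $itemNo;
        case 'PATCH':
            // Update only the supplied fields of an existing item.
            if (!isset($items[$itemNo])) {
                return null;
            }
            $items[$itemNo] = array_merge($items[$itemNo], $data);
            return $itemNo;
        case 'DELETE':
            unset($items[$itemNo]);
            return $itemNo;
        default:
            throw new InvalidArgumentException("Unsupported verb: $verb");
    }
}

$items = array();
arkApplyVerb($items, 'PUT', 1000, array('type' => 'cut'));        // explicit item_no
$next = arkApplyVerb($items, 'POST', null, array('type' => 'fill')); // next item_no
arkApplyVerb($items, 'PATCH', 1000, array('period' => 'Roman'));  // partial update
echo $next, "\n"; // prints 1001
print_r(arkApplyVerb($items, 'GET', 1000)); // type => cut, period => Roman
```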
Examples:
- api/<version>/<module>/<item>
- api/v2/cxt/MNO12_1000 - fetch/update/delete item MNO12_1000
- api/v2/contexts?field=value&field2=value2 - returns the search results inside contexts
- api/v2/contexts?sort=field1,field2 - returns the contexts sorted by field1, then field2
- api/v2/contexts?q=text - returns the free-text search results inside contexts
- api/v2/contexts/schema - returns the contexts json schema
- api/v2/schema - returns the full json schema
- api/v2/filters - returns the list of global saved filters
- api/v2/filters/123 - returns the saved filter definition
- api/v2/users/jlayt/filters - returns the list of user filters
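The query-string conventions in these examples (field filters, sort and q) could be separated along the following lines. This is a sketch under the assumption that sort and q are reserved names and every other parameter is a field filter; the function name arkParseQuery and the returned structure are hypothetical.

```php
<?php
// Hypothetical helper: split an API query string into free-text search,
// sort fields and field filters. "q" and "sort" are treated as reserved.
function arkParseQuery($queryString)
{
    parse_str($queryString, $params);
    $result = array('q' => null, 'sort' => array(), 'filters' => array());
    foreach ($params as $name => $value) {
        if (!is_string($value)) {
            continue; // ignore array-style parameters in this sketch
        }
        if ($name === 'q') {
            $result['q'] = $value;
        } elseif ($name === 'sort') {
            // Comma-separated list in priority order; drop empty entries.
            $result['sort'] = array_values(array_filter(explode(',', $value), 'strlen'));
        } else {
            $result['filters'][$name] = $value;
        }
    }
    return $result;
}

print_r(arkParseQuery('field=value&field2=value2&sort=field1,field2&q=text'));
// filters: field => value, field2 => value2; sort: field1, field2; q: text
```

Filter names would still need checking against the module's JSON schema before being used in a query.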
Notes:
- Updating a resource will require some kind of timestamp or last update key to prevent overwriting subsequent changes
- All security / OPTIONS / anon access will be controlled by user roles
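The first note describes what is usually called optimistic concurrency control: the client sends back the last-update value it originally read, and the update is refused if the stored value has since moved on. A minimal sketch, assuming a hypothetical updated field on each resource and an in-memory array in place of the database:

```php
<?php
// Hypothetical optimistic-concurrency check. $clientLastUpdated is the
// "updated" value the client received when it fetched the resource.
function arkUpdateIfUnchanged(array &$store, $id, array $changes, $clientLastUpdated, $now)
{
    if (!isset($store[$id])) {
        throw new OutOfBoundsException("No such resource: $id");
    }
    if ($store[$id]['updated'] !== $clientLastUpdated) {
        // Someone else changed the resource first; an API would typically
        // answer with a conflict status and let the client re-fetch.
        return false;
    }
    $store[$id] = array_merge($store[$id], $changes, array('updated' => $now));
    return true;
}

$store = array('MNO12_1000' => array('type' => 'cut', 'updated' => '2016-05-28 18:00:00'));

// First client read at 18:00:00 and updates successfully.
var_dump(arkUpdateIfUnchanged($store, 'MNO12_1000', array('type' => 'pit'), '2016-05-28 18:00:00', '2016-05-28 18:10:00'));
// bool(true)

// Second client still holds the 18:00:00 value, so its update is refused.
var_dump(arkUpdateIfUnchanged($store, 'MNO12_1000', array('type' => 'ditch'), '2016-05-28 18:00:00', '2016-05-28 18:11:00'));
// bool(false)
```

With a real database the comparison and the write would need to happen in one statement or transaction so that two requests cannot both pass the check.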
Frontend
The frontend will be migrated to TWIG templates, jQuery, and Bootstrap.
There will be separation between the ARK frontend and the Admin frontend. The ARK frontend will be the dynamically generated, data-driven side, configurable for every ARK. The Admin frontend will be static and consistent across all ARKs, but can be modified for site-specific requirements if needed. This separation will allow ARK to run as a pure database/API backend server, with a basic admin and auth frontend provided, without the user having to configure or enable any of the web frontend.
Potential Bootstrap admin templates:
- SB Admin 2
- AdminLTE
- Gentelella
Migration
A migration process from ARK 1 to ARK 2 will be provided.
Data migration. Existing tables will need to change from MyISAM to InnoDB. Changing tables in place carries a risk of data loss if the migration fails part way, and attempting to restart a failed migration is also prone to error. To protect users' data, a new database will be created with new tables and the data copied across. Should migration fail, users will easily be able to roll back to their old install, or keep retrying the migration until it does succeed. In effect the ARK init script will be run, followed by the migration script.
User migration. Users will be migrated from LiveUser to the new RBAC system. This will require a compatible default user config.
Config migration. A config migration script will be provided, but may require adapting for individual ARKs.
Changes
Details of changes made in ARK2.
Code Repository
Development of ARK 2.0 is occurring in the open on GitHub https://github.com/lparchaeology/ark2
Configuration
Significant changes to the configuration of ARK are being made to move from PHP file based configuration to database based configuration. This section will document these changes.
- The config/ folder will contain all user-editable php files required, all other config will be in the database
- The env_settings.php file is replaced by server.php and paths.php
- server.php contains the settings for the database connection and root server path and should be the only file requiring editing for a default ARK install
- paths.php contains the settings for the server file paths and should not need editing
- To set up an ARK, copy the config folder from php/arkdb/config to the root folder and edit as required
- preflight_checks.php now defaults to off, so needs to be enabled before running, and then deleted from config afterwards
- settings.php has moved from config/ to php/settings/ and no longer requires user editing, all settings are now held in the database and should be configured per the instructions
Database
- Configuration has been moved to the database
- A new ADO class wrapping PDO has been created to provide all database access for the new OO config classes
- db_functions.php has been cleaned up to move repeated code into new routines:
- dbTimestamp() returns a timestamp
- dbRunAddQuery() inserts a single row into a table
- dbUpdateSingleIdRow() updates a single ID table row
- dbUpdateAllRows() updates all rows matching a given key
- All DB functions have been moved into db_functions.php and use the new DB routines so they no longer create SQL themselves
Globals
Globals are being progressively removed and replaced where possible by access to config objects.
A number of config global variables have been renamed for consistency
- Any var ending in _dir is an absolute filesystem directory path
- Any var ending in _path is a URL path relative to the hostname and always starts with a '/'
- Neither var ever ends in a separator
- $ark_server_path -> $ark_root_dir
- $ark_dir -> $ark_root_path and no longer ends in a /
- $registered_files_host -> $registered_files_path
- $phMagickDir -> $phmagick_file
- $ark_web_maptemp_dir -> $ark_maptemp_path
- $ark_lib_dir and $ark_lib_path point to the library folder
- $skins_dir and $skins_path point to the skins folder
- $skin_dir and $skin_path point to the current skin folder
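The _dir and _path conventions above can be enforced with two small helpers when values are read from config. The function names below are hypothetical and the example values are made up; this is a sketch of the rule, not the ARK code.

```php
<?php
// Hypothetical helper for *_dir values: a filesystem directory path
// that never ends in a separator.
function arkNormaliseDir($dir)
{
    // Trim both kinds of slash so Windows-style input is handled too.
    return rtrim($dir, '/\\');
}

// Hypothetical helper for *_path values: a URL path relative to the
// hostname that always starts with '/' and never ends in one.
function arkNormalisePath($path)
{
    // The bare root is the one case where both rules cannot hold at once;
    // this sketch keeps the leading slash and returns '/'.
    return '/' . trim($path, '/');
}

echo arkNormaliseDir('/srv/www/ark/'), "\n"; // prints "/srv/www/ark"
echo arkNormalisePath('ark/skins/'), "\n";   // prints "/ark/skins"
```

With both kinds of value normalised this way, code can always join segments with a single separator and never has to check for doubled or missing slashes.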
A number of config global variables have been renamed for clarity
- $mode -> $search_mode
- $ftx_mode -> $search_ftx_mode
A number of config global variables have been deleted as they are not used:
- $default_year
- $conf_non_search_words
- $conf_langs
- $loaded_map_modules
- $default_output_mode
The logging globals have been changed
- $log, $conf_log_add, $conf_log_edt, $conf_log_del are deleted
- $log_ins, $log_upd and $log_del are used instead
A number of globals have been replaced by PHP5 constants
- $fs_path_sep has been replaced with PATH_SEPARATOR
- $fs_slash has been replaced with DIRECTORY_SEPARATOR
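For reference, both constants are built into PHP and need no setup. The short sketch below shows the kind of use each replaces; the file and folder names are only examples.

```php
<?php
// DIRECTORY_SEPARATOR joins the segments of a single filesystem path.
$file = implode(DIRECTORY_SEPARATOR, array('config', 'server.php'));

// PATH_SEPARATOR separates entries in a list of paths, such as include_path.
$includePath = implode(PATH_SEPARATOR, array('src', 'vendor'));

// The exact output depends on the platform the script runs on.
echo $file, "\n", $includePath, "\n";
```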
A number of config globals have been removed as they are provided through alternative means:
- $conf_pages
- $conf_media_browser ($default_media_browser holds subform_id for now)