
Storage Backend
===============

The storage backend implements the "low level" folder abstraction, which is
basically an arbitrary "BLOB" containing some document. The key feature is that
we extract "quick access" / "searchable" attributes from the document content.

Furthermore, it contains the "folder management" API, since named folders can
be stored in different databases.
Note: we need a way to tell where "new" folders should be created
Note: to sync with LDAP we need to periodically delete or archive old folders

Each folder has an associated type (like 'calendar') which defines the query
attributes and the serialization format.
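
As a rough illustration of the "quick attribute" idea (a sketch only, not the
actual OCS code), the following Python snippet copies a few fields out of a
document blob according to a per-folder-type field list. The FolderType class,
the field names and the line-based "KEY:value" blob format are assumptions
made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FolderType:
    """Hypothetical stand-in for a folder type: a name plus its quick fields."""
    name: str
    quick_fields: tuple  # blob keys that get copied into the quick table


def extract_quick_attributes(folder_type, blob):
    """Return the quick attributes found in a 'KEY:value' line-based blob."""
    wanted = {key.upper() for key in folder_type.quick_fields}
    quick = {}
    for line in blob.splitlines():
        key, sep, value = line.partition(":")
        # Keep only lines that look like KEY:value and are listed as quick fields.
        if sep and key.strip().upper() in wanted:
            quick[key.strip().lower()] = value.strip()
    return quick


# Assumed example type: a 'calendar' folder with three searchable fields.
CALENDAR = FolderType("calendar", ("UID", "SUMMARY", "DTSTART"))

blob = (
    "BEGIN:VEVENT\n"
    "UID:1234-ABCD\n"
    "SUMMARY:Team meeting\n"
    "DTSTART:20040629T100000Z\n"
    "LOCATION:Room 1\n"
    "END:VEVENT\n"
)
row = extract_quick_attributes(CALENDAR, blob)
```

The blob itself stays untouched; only the fields named by the folder type end
up in the quick row, so LOCATION is not part of the result here.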

TODO
====
- fix some OCS naming
  - defaults
  - lookup directories
- hierarchies deeper than 4 (properly filter on path in OCS)

Open Questions
==============

System meta data in the blob table or in the quick table?
- master data belongs in the blob table
- could be regular 'NSxxx' keys to differentiate meta data from

Class Hierarchy
===============

  [NSObject]
    OCSContext                  - tracking context
    OCSFolder                   - represents a single folder
    OCSFolderManager            - manages folders
    OCSFolderType               - the mapping info for a specific folder-type
    OCSFieldInfo                - mapping info for one 'quick field'
    OCSChannelManager           - maintains EOAdaptorChannel objects

  TBD:
  - field 'extractor'
  - field 'value' (e.g. array values for participants?)
  - BLOB archiver/unarchiver

Defaults
========

  OCSFolderInfoURL - the DB URL where the folder-info table is located
    e.g.: http://OGo:OGo@localhost/test/folder_info

  OCSFolderManagerDebugEnabled      - enable folder-manager debug logs
  OCSFolderManagerSQLDebugEnabled   - enable folder-manager SQL generation
                                      debug logs

  OCSChannelManagerDebugEnabled     - enable channel pooling debug logs
  OCSChannelManagerPoolDebugEnabled - debug pool handle allocation

  OCSChannelExpireAge       - if a channel exceeds this age (in seconds), it
                              will be removed from the pool
  OCSChannelCollectionTimer - interval in seconds at which the pool is
                              checked for channels that are too old

  [PGDebugEnabled] - enable PostgreSQL adaptor debugging
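
How OCSChannelExpireAge and OCSChannelCollectionTimer might play together can
be sketched as follows. This is a simplified Python model, not the
OCSChannelManager implementation: the ChannelPool class, its method names and
the injectable clock are assumptions for illustration, and channels are plain
strings.

```python
import time


class ChannelPool:
    """Hypothetical pool that drops channels older than expire_age seconds."""

    def __init__(self, expire_age, clock=time.monotonic):
        self.expire_age = expire_age  # plays the role of OCSChannelExpireAge
        self._clock = clock           # injectable so the example is deterministic
        self._idle = []               # list of (channel, time it was added)

    def add(self, channel):
        self._idle.append((channel, self._clock()))

    def collect(self):
        """Remove channels whose age exceeds expire_age; return how many."""
        now = self._clock()
        kept = [(c, t) for c, t in self._idle if now - t <= self.expire_age]
        removed = len(self._idle) - len(kept)
        self._idle = kept
        return removed

    def __len__(self):
        return len(self._idle)


# Drive the pool with a fake clock instead of waiting for real time to pass.
fake_now = [0.0]
pool = ChannelPool(expire_age=60, clock=lambda: fake_now[0])
pool.add("channel-a")      # added at t=0
fake_now[0] = 50.0
pool.add("channel-b")      # added at t=50
fake_now[0] = 100.0
removed = pool.collect()   # channel-a is 100s old, channel-b is 50s old
```

In a running process, a timer would call collect() every
OCSChannelCollectionTimer seconds, so a channel can outlive its expire age by
up to one timer interval.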

URLs
====

  "Database URLs"

  We use the following URL format:
    postgresql://[user]:[password]@[host]:[port]/[dbname]/[tablename]
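
To make the URL layout concrete, here is a small Python sketch that splits
such a URL into its parts using the standard urllib.parse module. The
parse_database_url function and the keys of the returned dictionary are
assumptions for illustration and not part of the OCS API.

```python
from urllib.parse import urlsplit, unquote


def parse_database_url(url):
    """Split scheme://user:password@host:port/dbname/tablename into a dict."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 2:
        raise ValueError("expected /[dbname]/[tablename] in %r" % url)
    dbname, tablename = segments
    return {
        "scheme": parts.scheme,
        "user": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
        "host": parts.hostname,
        "port": parts.port,  # None when the URL carries no port
        "dbname": dbname,
        "tablename": tablename,
    }


info = parse_database_url("postgresql://OGo:OGo@localhost:5432/test/folder_info")
```

The same function also accepts the http:// form used for OCSFolderInfoURL,
since only the scheme differs.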

Support Tools
=============

- tools we need:
  - one to recreate a quick table based on the blob table
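
What such a rebuild tool could do is sketched below in Python. SQLite is used
only to keep the example self-contained (the real store runs on PostgreSQL),
and the table names doc_blob / doc_quick, the column names and the
"SUMMARY:" extraction rule are all assumptions for illustration.

```python
import sqlite3


def extract_summary(content):
    """Assumed extractor: return the value of the first SUMMARY: line, if any."""
    for line in content.splitlines():
        if line.startswith("SUMMARY:"):
            return line[len("SUMMARY:"):].strip()
    return None


def rebuild_quick_table(conn):
    """Drop the quick table and refill it from the blob table."""
    conn.execute("DROP TABLE IF EXISTS doc_quick")
    conn.execute(
        "CREATE TABLE doc_quick ("
        " name VARCHAR(256) PRIMARY KEY,"
        " summary VARCHAR(256))"
    )
    rows = conn.execute("SELECT name, content FROM doc_blob").fetchall()
    conn.executemany(
        "INSERT INTO doc_quick (name, summary) VALUES (?, ?)",
        [(name, extract_summary(content)) for name, content in rows],
    )
    conn.commit()
    return len(rows)


# Set up a tiny blob table with two documents keyed by a string name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE doc_blob (name VARCHAR(256) PRIMARY KEY, content TEXT)"
)
conn.executemany(
    "INSERT INTO doc_blob (name, content) VALUES (?, ?)",
    [
        ("a.ics", "UID:a\nSUMMARY:Lunch\n"),
        ("b.ics", "UID:b\nSUMMARY:Review\n"),
    ],
)
count = rebuild_quick_table(conn)
quick_rows = conn.execute(
    "SELECT name, summary FROM doc_quick ORDER BY name"
).fetchall()
```

Because the blob table holds the master data, the quick table can be thrown
away and regenerated whenever the set of indexed fields changes.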

Notes
=====

- we need to use http:// URLs for connect info until generic URLs in
  libFoundation are fixed (the parser breaks on the login/password parts)

QA
==

Q: Why do we use two tables when we could store the quick columns in the blob?
==
They could be in the same table. We considered using separate tables since the
quick table is likely to be recreated now and then if BLOB indexing
requirements change.
Actually, one could even use different _quick tables which share a common BLOB
table
(a quick table is nothing more than a database index, and as with DB indexes,
 multiple ones for different requirements can make sense).

Furthermore, it might improve caching behaviour for row-based caches (the quick
table is going to be queried much more often). We are not sure whether this is
relevant with PostgreSQL; probably not.

Q: Can we use a VARCHAR primary key?
==
We asked in the postgres IRC channel and apparently the performance penalty of
string primary keys isn't big.
We could also use an 'internal' int sequence in addition (might be useful for
supporting ZideLook).
Motivation: the 'iCalendar' ID is a string and usually looks like a GUID.

Q: Why use VARCHAR instead of TEXT in the BLOB?
==
To quote the PostgreSQL documentation:
"There are no performance differences between these three types, apart from
 the increased storage size when using the blank-padded type."
So varchar(xx) is just TEXT with a length limit. Since we intend to store
mostly small snippets of data (tiny XML fragments), we considered VARCHAR the
more appropriate type.