Peruser Configuration Decision Tree

When the JenaConfiguredCommandMachine is used under cocoon, you have quite a few opportunities to customize your application. Even when using the simpler machines, there is a bounty of options.

This list identifies your most important configuration branch points, starting with the most general aspects and proceeding to the most specific.

  1. Any cocoon instance may contain any number of sitemaps, which are mounted in a delegation tree hierarchy.  The top level sitemap gets the first crack at matching inbound HTTP requests, which it may choose to delegate to any lower level sitemap, which may in turn delegate to an even lower sitemap, and so on.  In cocoon terminology, this sitemap delegation is called "mounting".
  2. Any sitemap may define any number of transformer types (using the map:transformer tag), which are available in all sitemaps below it in the sitemap hierarchy.
  3. Each transformer type is bound to a java class (through its src attribute) which implements the cocoon transformer API interfaces. A java class may be used by any number of separate transformer type definitions.
  4. For peruser-specific transformation behaviors, we use a single java class: net.peruser.binding.cocoon.PeruserTransformer, which you will usually not need to extend or modify.
  5. Each map:transformer type that is bound to the PeruserTransformer class has a name (the PT-type name) and additional configuration (the PT-type config) associated with it as part of the contents of the map:transformer tag.  In particular, the PT-type config will contain the description of a pm:machine instance.
  6. A pm:machine is bound to a string name called pm:cuteName and to a java class, pm:class. It may also be bound to various other pieces of machine-specific configuration information. You may want to create your own Machine classes and bind them into peruser at this level, but this is not necessary for beginners and is not the only way to extend peruser+cocoon with java code.
  7. For complex behaviors, the most commonly used peruser machine class is net.peruser.binding.jena.JenaConfiguredCommandMachine, which expects a single piece of additional information at the PT-type level: pp:assemblyURI. When this machine class is used, we call the machine instance a PJCCM.
  8. This assemblyURI refers to an RDF resource which is known to the PeruserJenaKernel, typically (but not necessarily) because it was loaded from a configuration file at the time Peruser was booted. This resource must have the RDF type ja:Model (according to the Jena Assembler Definition). This resource thus refers indirectly to some RDF model, called the configModel for this PJCCM. This configModel may be fetched verbatim from a file, loaded from a database, or derived via merging and/or inference from other RDF models.
  9. The PT-type defined in step 5 (which is bound to a machine, which, if it is a PJCCM, is bound to a config model indirectly via an assemblerDescription) may be used in the construction of pipelines, via the map:transform tag, in the sitemap where the PT-type was defined or in any sitemap below it in the sitemap delegation/mounting hierarchy. When it is so used, it defines a PT-instance. The PT-instance is bound to the PT-type using the type attribute.
  10. A PT-instance is further configured using its (required) src instruction attribute (which must be a URI) and, optionally, some name/value pairs passed using map:parameter tags. The specification of these values may make use of cocoon sitemap substitution (expressions in curly braces within attribute values), which may rely on peruser infrastructure (e.g. SPARQL query results) through the PeruserInputModule, which is bound to the namespace prefix pim: in the peruser root sitemap.
  11. When the CocoonServlet receives an inbound HTTP request, it uses sitemap rules to determine the pipeline(s) to be executed.  When a pipeline includes a PT-instance, then the "transform()" method on the PeruserTransformer class is invoked, and passed an input XML DOM Tree (the output of the previous stage in the pipeline).  The transform() method must produce an output tree (or throw a java exception).  Here is how that transformation takes place:
    1. The bound machine is looked up or created as necessary.
    2. The bound machine is told to process(instructAddr, inDoc, env), where instructAddr is the instruction URI from step 10, inDoc is the XML DOM Tree input, and env is the computational environment. For more details on what happens here, see step 12 below.
    3. This process method returns an outData block, which must be convertible to an output XML DOM Tree, which is passed to the next stage in the cocoon pipeline.
  12. Within the process() method, if the bound machine happens to be a PJCCM (the usual case), then the following steps take place:
    1. The current config model of the machine is cloned to produce a mutableConfig.
    2. The mutableConfig is asked to applyOverrides(inDoc), giving it a chance to apply any configuration that may be in the input XML stream (which may include information from the input HTTP request, and/or previous stages in the pipeline processing).
    3. The method AbstractCommand.instantiateAndConfigure(env, commandConf, instructAddr) is called to construct and configure a net.peruser.core.command.Command to be executed. The env and instructAddr here are the same ones passed in from step 11.2 above. Within instantiateAndConfigure, the instructAddr is usually used to look up an RDF description of a command to be executed (although non-RDF configuration is also possible). This description must contain a commandClassName, which is the name of a java class which implements the net.peruser.core.command.Command interface. These commands are generally parts of peruser modules in which some specific functionality (e.g. model querying, model updating, database operations) is implemented. You may construct your own modules with your own commands.
    4. The command is told to work(inputDoc), which must accomplish the objectives of the transformation and produce output data in a form convertible to an XML DOM Tree.  The command has read/write access to all state known to the peruser system (RDF models, databases, etc.) as well as its own bound configuration information.
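
The request flow in steps 11 and 12 can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual peruser source: the Command and Machine interfaces, SketchDoc, SketchCommandMachine, PrefixCommand, and the urn:sketch:greet address are all hypothetical stand-ins. A plain map replaces the RDF configModel, a small value class replaces the XML DOM Tree, and a table of factories replaces loading a command by its commandClassName.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Demo entry point: runs one request through the sketched machine. */
class PeruserFlowSketch {
    public static void main(String[] args) {
        Map<String, String> configModel = new HashMap<>();
        configModel.put("urn:sketch:greet", "PrefixCommand"); // instructAddr -> commandClassName
        configModel.put("prefix", "Hello, ");

        Map<String, Function<Map<String, String>, Command>> factories = new HashMap<>();
        factories.put("PrefixCommand", conf -> new PrefixCommand(conf.get("prefix")));

        Machine machine = new SketchCommandMachine(configModel, factories);
        SketchDoc out = machine.process("urn:sketch:greet",
                new SketchDoc("world", Map.of()), new HashMap<>());
        System.out.println(out.body); // prints "Hello, world"
    }
}

/** Hypothetical stand-in for the XML DOM Tree passed between pipeline stages. */
class SketchDoc {
    final String body;
    final Map<String, String> overrides; // configuration carried in the input stream
    SketchDoc(String body, Map<String, String> overrides) {
        this.body = body;
        this.overrides = overrides;
    }
}

/** Hypothetical stand-in for net.peruser.core.command.Command; the real interface may differ. */
interface Command {
    SketchDoc work(SketchDoc inputDoc);
}

/** Hypothetical stand-in for a peruser machine (step 11.2). */
interface Machine {
    SketchDoc process(String instructAddr, SketchDoc inDoc, Map<String, Object> env);
}

/** Simplified model of a PJCCM. */
class SketchCommandMachine implements Machine {
    private final Map<String, String> configModel;
    // Assumption: factories keyed by class name stand in for reflective class loading.
    private final Map<String, Function<Map<String, String>, Command>> factories;

    SketchCommandMachine(Map<String, String> configModel,
                         Map<String, Function<Map<String, String>, Command>> factories) {
        this.configModel = configModel;
        this.factories = factories;
    }

    @Override
    public SketchDoc process(String instructAddr, SketchDoc inDoc, Map<String, Object> env) {
        // Step 12.1: clone, so per-request overrides never leak into the shared config.
        Map<String, String> mutableConfig = new HashMap<>(configModel);
        // Step 12.2: let the input document override configuration values.
        mutableConfig.putAll(inDoc.overrides);
        // Step 12.3: resolve the instruction address to a command and configure it.
        Command command = instantiateAndConfigure(mutableConfig, instructAddr);
        // Step 12.4: the command produces the output document.
        return command.work(inDoc);
    }

    private Command instantiateAndConfigure(Map<String, String> conf, String instructAddr) {
        String commandClassName = conf.get(instructAddr);
        Function<Map<String, String>, Command> factory = factories.get(commandClassName);
        if (factory == null) {
            throw new IllegalArgumentException("No command bound to " + instructAddr);
        }
        return factory.apply(conf);
    }
}

/** Example command: prepends a configured prefix to the document body. */
class PrefixCommand implements Command {
    private final String prefix;
    PrefixCommand(String prefix) { this.prefix = prefix; }

    @Override
    public SketchDoc work(SketchDoc inputDoc) {
        return new SketchDoc(prefix + inputDoc.body, inputDoc.overrides);
    }
}
```

Cloning the config before applying overrides is the design point worth noticing: a request whose input document carries an override for prefix changes only its own mutableConfig, and the next request starts again from the shared configModel.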