| |
- AttributeList
- Attributes
- ContentHandler
- DTDHandler
- DeclHandler
- DocumentHandler
- EntityResolver
  - HandlerBase(EntityResolver, DTDHandler, DocumentHandler, ErrorHandler)
- ErrorHandler
- exceptions.Exception
  - SAXException
    - SAXNotRecognizedException
    - SAXNotSupportedException
    - SAXParseException
- InputSource
- LexicalHandler
- Locator
- Parser
- XMLReader
  - IncrementalParser
  - XMLFilter
class AttributeList |
|
*DEPRECATED*
Interface for an attribute list. This interface provides
information about a list of attributes for an element (only
specified or defaulted attributes will be reported). Note that the
information returned by this object will be valid only during the
scope of the DocumentHandler.startElement callback, and the
attributes will not necessarily be provided in the order declared
or specified.
|
| |
- __getitem__(self, key)
- Alias for getName (if key is an integer) and getValue (if string).
- __len__(self)
- Alias for getLength.
- copy(self)
- Return a copy of the AttributeList.
- get(self, key, alternative=None)
- Return the value associated with attribute name; if it is not
- available, then return the alternative.
- getLength(self)
- Return the number of attributes in list.
- getName(self, i)
- Return the name of an attribute in the list.
- getType(self, i)
- Return the type of an attribute in the list. (Parameter can be
- either integer index or attribute name.)
- getValue(self, i)
- Return the value of an attribute in the list. (Parameter can be
- either integer index or attribute name.)
- has_key(self, key)
- True if the attribute is in the list, false otherwise.
- items(self)
- Return a list of (attribute_name, value) pairs.
- keys(self)
- Returns a list of the attribute names.
- values(self)
- Return a list of all attribute values.
|
class Attributes |
|
Interface for a set of XML attributes.
Contains a set of XML attributes, accessible by name. The
attributes have no order, and only attributes that were either
actually present on the element or defaulted from the DTD declarations
will be present in the set. (Attributes declared as #IMPLIED, but not
specified in the document will not be present.)
|
| |
- __getitem__(self, name)
- Returns the value of the attribute with the given name. (Alias
- for getValue.)
- __len__(self)
- Returns the number of attributes in the list. (Alias for
- getLength.)
- copy(self)
- Return a copy of the Attributes object.
- get(self, name, alternative=None)
- Return the value associated with the attribute name; if it is not
- available, then return the alternative.
- getLength(self)
- Returns the number of attributes in the list.
- getNameByQName(self, name)
- Returns the namespace name of the attribute with the given
- raw (or qualified) name.
- getNames(self)
- Returns a list of the names of all attributes
- in the list.
- getQNames(self)
- Returns a list of the raw qualified names of all attributes
- in the list.
- getType(self, name)
- Returns the type of the attribute with the given name.
- getValue(self, name)
- Returns the value of the attribute with the given name.
- getValueByQName(self, name)
- Returns the value of the attribute with the given raw (or
- qualified) name.
- has_key(self, name)
- True if the attribute is in the list, false otherwise.
- items(self)
- Return a list of (attribute_name, value) pairs.
- keys(self)
- Returns a list of the attribute names in the list.
- values(self)
- Return a list of all attribute values.
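The accessors above can be tried without running a parser. The following is a minimal sketch that assumes the Python 3 standard library class xml.sax.xmlreader.AttributesImpl as a stand-in for an Attributes object. The summarize helper and the sample attribute names are hypothetical. Python 3 has no has_key method, so the "in" operator takes its place there.

```python
from xml.sax.xmlreader import AttributesImpl


def summarize(attrs):
    """Collect a few facts about an Attributes-like object (hypothetical helper)."""
    return {
        "count": attrs.getLength(),
        "names": sorted(attrs.getNames()),
        "id": attrs.getValue("id"),
        # get() falls back to the alternative when the name is absent.
        "missing": attrs.get("missing", "n/a"),
    }


# AttributesImpl wraps a plain mapping of attribute name to value.
attrs = AttributesImpl({"id": "a1", "lang": "en"})
summary = summarize(attrs)
```

Because __getitem__ and __len__ are aliases for getValue and getLength, attrs["lang"] and len(attrs) work as well.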
|
class ContentHandler |
|
Interface for receiving logical document content events.
This is the main callback interface in SAX, and the one most
important to applications. The order of events in this interface
mirrors the order of the information in the document.
|
| |
- __init__(self)
- Sets the internal locator attribute to None.
- characters(self, content)
- Receive notification of character data.
- The XMLReader will call this method to report each chunk of
- character data. SAX parsers may return all contiguous
- character data in a single chunk, or they may split it into
- several chunks; however, all of the characters in any single
- event must come from the same external entity so that the
- Locator provides useful information.
- endDocument(self)
- Receive notification of the end of a document.
- The SAX parser will invoke this method only once, and it will
- be the last method invoked during the parse. The parser shall
- not invoke this method until it has either abandoned parsing
- (because of an unrecoverable error) or reached the end of
- input.
- endElement(self, name, qname)
- Signals the end of an element.
- The name parameter contains the name of the element type, just
- as with the startElement event.
- endPrefixMapping(self, prefix)
- End the scope of a prefix-URI mapping.
- See startPrefixMapping for details. This event will always
- occur after the corresponding endElement event, but the order
- of endPrefixMapping events is not otherwise guaranteed.
- ignorableWhitespace(self, chars, start, end)
- Receive notification of ignorable whitespace in element content.
- Validating XMLReaders must use this method to report each chunk
- of ignorable whitespace (see the W3C XML 1.0 recommendation,
- section 2.10): non-validating parsers may also use this method
- if they are capable of parsing and using content models.
- SAX parsers may return all contiguous whitespace in a single
- chunk, or they may split it into several chunks; however, all
- of the characters in any single event must come from the same
- external entity, so that the Locator provides useful
- information.
- The application must not attempt to read from the array
- outside of the specified range.
- processingInstruction(self, target, data)
- Receive notification of a processing instruction.
- The XMLReader will invoke this method once for each processing
- instruction found: note that processing instructions may occur
- before or after the main document element.
- A SAX parser should never report an XML declaration (XML 1.0,
- section 2.8) or a text declaration (XML 1.0, section 4.3.1)
- using this method.
- setDocumentLocator(self, locator)
- Called by the parser to give the application a locator for
- locating the origin of document events.
- SAX parsers are strongly encouraged (though not absolutely
- required) to supply a locator: a parser that does so must supply
- the locator to the application by invoking this method before
- invoking any of the other methods in the ContentHandler
- interface.
- The locator allows the application to determine the end
- position of any document-related event, even if the parser is
- not reporting an error. Typically, the application will use
- this information for reporting its own errors (such as
- character content that does not match an application's
- business rules). The information returned by the locator is
- probably not sufficient for use with a search engine.
- Note that the locator will return correct information only
- during the invocation of the events in this interface. The
- application should not attempt to use it at any other time.
- skippedEntity(self, name)
- Receive notification of a skipped entity.
- The XMLReader will invoke this method once for each entity
- skipped. Non-validating processors may skip entities if they
- have not seen the declarations (because, for example, the
- entity was declared in an external DTD subset). All processors
- may skip external entities, depending on the values of the
- http://xml.org/sax/features/external-general-entities and the
- http://xml.org/sax/features/external-parameter-entities
- features.
- startDocument(self)
- Receive notification of the beginning of a document.
- The SAX parser will invoke this method only once, before any
- other methods in this interface or in DTDHandler (except for
- setDocumentLocator).
- startElement(self, name, qname, attrs)
- Signals the start of an element.
- The name parameter contains the name of the element type as a
- (uri, localname) tuple, the qname parameter the raw XML 1.0
- name used in the source document, and the attrs parameter
- holds an instance of the Attributes class containing the
- attributes of the element.
- startPrefixMapping(self, prefix, uri)
- Begin the scope of a prefix-URI Namespace mapping.
- The information from this event is not necessary for normal
- Namespace processing: the SAX XML reader will automatically
- replace prefixes for element and attribute names when the
- http://xml.org/sax/features/namespaces feature is true (the
- default).
- There are cases, however, when applications need to use
- prefixes in character data or in attribute values, where they
- cannot safely be expanded automatically; the
- start/endPrefixMapping event supplies the information to the
- application to expand prefixes in those contexts itself, if
- necessary.
- Note that start/endPrefixMapping events are not guaranteed to
- be properly nested relative to each other: all
- startPrefixMapping events will occur before the corresponding
- startElement event, and all endPrefixMapping events will occur
- after the corresponding endElement event, but their order is
- not guaranteed.
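The relationship between prefix mappings and (uri, localname) element names can be sketched as follows. This assumes the Python 3 standard library xml.sax package rather than the module documented here. In the standard library the namespace-aware callback is called startElementNS, and namespace processing is off by default, so the sketch enables it explicitly. NamespaceRecorder, record_namespaces and the sample document are hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NamespaceRecorder(ContentHandler):
    """Records prefix mappings, expanded element names and character data."""

    def __init__(self):
        super().__init__()
        self.mappings = []
        self.elements = []
        self.text = []

    def startPrefixMapping(self, prefix, uri):
        # Reported before the startElementNS event of the declaring element.
        self.mappings.append((prefix, uri))

    def startElementNS(self, name, qname, attrs):
        # name is a (uri, localname) tuple; qname may be None for some drivers.
        self.elements.append(name)

    def characters(self, content):
        # Text may arrive in several chunks, so collect and join later.
        self.text.append(content)


def record_namespaces(xml_bytes):
    reader = xml.sax.make_parser()
    reader.setFeature(feature_namespaces, True)
    handler = NamespaceRecorder()
    reader.setContentHandler(handler)
    reader.parse(io.BytesIO(xml_bytes))
    return handler


DOC = b'<d:doc xmlns:d="urn:example:doc"><d:p>hello</d:p></d:doc>'
recorded = record_namespaces(DOC)
```

Joining the collected chunks in recorded.text, rather than relying on a single characters call, follows the note above that a parser may split contiguous character data.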
|
class DeclHandler |
|
Optional SAX2 handler for DTD declaration events.
Note that some DTD declarations are already reported through the
DTDHandler interface. All events reported to this handler will
occur between the startDTD and endDTD events of the
LexicalHandler.
To set the DeclHandler for an XMLReader, use the setProperty method
with the identifier http://xml.org/sax/handlers/DeclHandler.
|
| |
- attributeDecl(self, elem_name, attr_name, type, value_def, value)
- Report an attribute type declaration.
- Only the first declaration will be reported. The type will be
- one of the strings "CDATA", "ID", "IDREF", "IDREFS",
- "NMTOKEN", "NMTOKENS", "ENTITY", "ENTITIES", or "NOTATION", or
- a list of names (in the case of enumerated definitions).
- elem_name is the element type name, attr_name the attribute
- type name, type a string representing the attribute type,
- value_def a string representing the default declaration
- ('#IMPLIED', '#REQUIRED', '#FIXED' or None). value is a string
- representing the attribute's default value, or None if there
- is none.
- elementDecl(self, elem_name, content_model)
- Report an element type declaration.
- Only the first declaration will be reported.
- content_model is the string 'EMPTY', the string 'ANY' or the content
- model structure represented as tuple (separator, tokens, modifier)
- where separator is the separator in the token list (that is, '|' or
- ','), tokens is the list of tokens (element type names or tuples
- representing parentheses) and modifier is the quantity modifier
- ('*', '?' or '+').
- externalEntityDecl(self, name, public_id, system_id)
- Report a parsed entity declaration. (Unparsed entities are
- reported to the DTDHandler.)
- Only the first declaration for each entity will be reported.
- name is the name of the entity. If it is a parameter entity,
- the name will begin with '%'. public_id and system_id are the
- public and system identifiers of the entity. public_id will be
- None if none was declared.
- internalEntityDecl(self, name, value)
- Report an internal entity declaration.
- Only the first declaration of an entity will be reported.
- name is the name of the entity. If it is a parameter entity,
- the name will begin with '%'. value is the replacement text of
- the entity.
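The content_model structure passed to elementDecl can be illustrated by turning it back into DTD syntax. This is only a sketch under stated assumptions: tokens are taken to be either element type name strings or nested (separator, tokens, modifier) tuples, and an absent modifier is assumed to be None or an empty string, which the description above does not specify. render_content_model is a hypothetical helper and is not part of the module.

```python
def render_content_model(model):
    """Render a content model as DTD text (hypothetical helper).

    Assumes tokens are plain names or nested tuples, and that a missing
    modifier is None or ''. Neither detail is confirmed by the interface
    description.
    """
    if model in ("EMPTY", "ANY"):
        return model
    separator, tokens, modifier = model
    parts = [
        token if isinstance(token, str) else render_content_model(token)
        for token in tokens
    ]
    return "(" + separator.join(parts) + ")" + (modifier or "")


# A sequence of a title followed by any number of para or list elements.
model = (",", ["title", ("|", ["para", "list"], "*")], "")
rendered = render_content_model(model)
```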
|
class DocumentHandler |
|
*DEPRECATED*
Handle general document events. This is the main client
interface for SAX: it contains callbacks for the most important
document events, such as the start and end of elements. You need
to create an object that implements this interface, and then
register it with the Parser. If you do not want to implement
the entire interface, you can derive a class from HandlerBase,
which implements the default functionality. You can find the
location of any document event using the Locator interface
supplied by setDocumentLocator().
|
| |
- characters(self, ch, start, length)
- Handle a character data event.
- endDocument(self)
- Handle an event for the end of a document.
- endElement(self, name)
- Handle an event for the end of an element.
- ignorableWhitespace(self, ch, start, length)
- Handle an event for ignorable whitespace in element content.
- processingInstruction(self, target, data)
- Handle a processing instruction event.
- setDocumentLocator(self, locator)
- Receive an object for locating the origin of SAX document events.
- startDocument(self)
- Handle an event for the beginning of a document.
- startElement(self, name, atts)
- Handle an event for the beginning of an element.
|
class EntityResolver |
|
Basic interface for resolving entities. If you create an object
implementing this interface and register the object with your
Parser, the parser will call the method in your object to
resolve all external entities. Note that DefaultHandler implements
this interface with the default behaviour.
|
| |
- resolveEntity(self, publicId, systemId)
- Resolve the system identifier of an entity and return either
- the system identifier to read from as a string, or an InputSource
- to read from.
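A resolver that returns an InputSource might look like the sketch below. It assumes the Python 3 standard library classes xml.sax.handler.EntityResolver and xml.sax.xmlreader.InputSource. InMemoryResolver, its documents mapping and the urn:example identifiers are hypothetical. Unknown identifiers resolve to an empty byte stream, so the parser never opens a URI connection itself.

```python
import io
from xml.sax.handler import EntityResolver
from xml.sax.xmlreader import InputSource


class InMemoryResolver(EntityResolver):
    """Serves external entities from a dict of system id to bytes."""

    def __init__(self, documents):
        self._documents = dict(documents)

    def resolveEntity(self, publicId, systemId):
        source = InputSource(systemId)
        source.setPublicId(publicId)
        # An empty stream for unknown ids avoids any network or file access.
        data = self._documents.get(systemId, b"")
        source.setByteStream(io.BytesIO(data))
        return source


resolver = InMemoryResolver({"urn:example:ext": b"hello"})
known = resolver.resolveEntity(None, "urn:example:ext")
unknown = resolver.resolveEntity(None, "urn:example:other")
```

Such an object would be registered with reader.setEntityResolver(resolver) before parsing.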
|
class ErrorHandler |
|
Basic interface for SAX error handlers. If you create an object
that implements this interface and register the object with your
XMLReader, the parser will call the methods in your object to report
all warnings and errors. There are three levels of errors
available: warnings, (possibly) recoverable errors, and
unrecoverable errors. All methods take a SAXParseException as the
only parameter.
|
| |
- error(self, exception)
- Handle a recoverable error. (Corresponds roughly to validity
- errors.)
- fatalError(self, exception)
- Handle a non-recoverable error. (Corresponds roughly to failures
- to be well-formed.)
- warning(self, exception)
- Handle a warning.
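The three error levels can be observed with a handler that records what it receives. This sketch assumes the Python 3 standard library xml.sax package. CollectingErrorHandler and check_well_formed are hypothetical names. The handler re-raises fatal errors because continuing after a well-formedness failure is not generally safe, while warnings and recoverable errors are only recorded.

```python
import xml.sax
from xml.sax.handler import ContentHandler, ErrorHandler


class CollectingErrorHandler(ErrorHandler):
    """Records every report and stops the parse on fatal errors."""

    def __init__(self):
        self.warnings = []
        self.errors = []
        self.fatal = []

    def warning(self, exception):
        self.warnings.append(exception)

    def error(self, exception):
        self.errors.append(exception)

    def fatalError(self, exception):
        self.fatal.append(exception)
        raise exception


def check_well_formed(xml_bytes):
    handler = CollectingErrorHandler()
    try:
        xml.sax.parseString(xml_bytes, ContentHandler(), handler)
    except xml.sax.SAXParseException:
        pass  # Already recorded by fatalError.
    return handler


bad = check_well_formed(b"<a><b></a>")
good = check_well_formed(b"<a/>")
```

Each recorded object is a SAXParseException, so getLineNumber and getColumnNumber can be used to point at the problem in the source text.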
|
class IncrementalParser(XMLReader) |
|
This interface adds three extra methods to the XMLReader
interface that allow XML parsers to support incremental
parsing. Support for this interface is optional, since not all
underlying XML parsers support this functionality.
When the parser is instantiated it is ready to begin accepting
data from the feed method immediately. After parsing has been
finished with a call to close, the reset method must be called to
make the parser ready to accept new data, either from feed or
using the parse method.
Note that these methods must _not_ be called during parsing, that
is, after parse has been called and before it returns.
|
| |
- close(self)
- This method is called when the entire XML document has been
- passed to the parser through the feed method, to notify the
- parser that there is no more data. This allows the parser to
- do the final checks on the document and empty the internal
- data buffer.
- The parser will not be ready to parse another document until
- the reset method has been called.
- close may raise SAXException.
- feed(self, data)
- This method gives the raw XML data in the data parameter to
- the parser and makes it parse the data, emitting the
- corresponding events. It is allowed for XML constructs to be
- split across several calls to feed.
- feed may raise SAXException.
- reset(self)
- This method is called after close has been called to reset
- the parser so that it is ready to parse new documents. The
- results of calling parse or feed after close without calling
- reset are undefined.
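The feed, close and reset sequence can be sketched as follows. This assumes the parser returned by the Python 3 standard library function xml.sax.make_parser, which supports incremental parsing. TagCollector and parse_in_chunks are hypothetical names. Note that the standard library handler signature startElement(name, attrs) differs from the one documented above.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TagCollector(ContentHandler):
    """Collects element names in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


def parse_in_chunks(chunks):
    parser = xml.sax.make_parser()
    handler = TagCollector()
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)  # XML constructs may be split across calls.
    parser.close()          # Final checks; flushes buffered data.
    parser.reset()          # Ready for another document.
    return handler.tags


tags = parse_in_chunks([b"<ro", b"ot><ch", b"ild/></root>"])
```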
|
class InputSource |
|
Encapsulation of the information needed by the XMLReader to
read entities.
This class may include information about the public identifier,
system identifier, byte stream (possibly with character encoding
information) and/or the character stream of an entity.
Applications will create objects of this class for use in the
XMLReader.parse method and for returning from
EntityResolver.resolveEntity.
An InputSource belongs to the application: the XMLReader is not
allowed to modify InputSource objects passed to it from the
application, although it may make copies and modify those.
|
| |
- __init__(self, system_id=None)
- no doc string
- getByteStream(self)
- Get the byte stream for this input source.
- The getEncoding method will return the character encoding for
- this byte stream, or None if unknown.
- getCharacterStream(self)
- Get the character stream for this input source.
- getEncoding(self)
- Get the character encoding of this InputSource.
- getPublicId(self)
- Returns the public identifier of this InputSource.
- getSystemId(self)
- Returns the system identifier of this InputSource.
- setByteStream(self, bytefile)
- Set the byte stream (a Python file-like object which does
- not perform byte-to-character conversion) for this input
- source.
- The SAX parser will ignore this if there is also a character
- stream specified, but it will use a byte stream in preference
- to opening a URI connection itself.
- If the application knows the character encoding of the byte
- stream, it should set it with the setEncoding method.
- setCharacterStream(self, charfile)
- Set the character stream for this input source. (The stream
- must be a Python 1.6 Unicode-wrapped file-like object that performs
- conversion to Unicode strings.)
- If there is a character stream specified, the SAX parser will
- ignore any byte stream and will not attempt to open a URI
- connection to the system identifier.
- setEncoding(self, encoding)
- Sets the character encoding of this InputSource.
- The encoding must be a string acceptable for an XML encoding
- declaration (see section 4.3.3 of the XML recommendation).
- The encoding attribute of the InputSource is ignored if the
- InputSource also contains a character stream.
- setPublicId(self, public_id)
- Sets the public identifier of this InputSource.
- setSystemId(self, system_id)
- Sets the system identifier of this InputSource.
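The precedence rules above (a byte stream is used in preference to opening the system identifier) can be sketched with the Python 3 standard library class xml.sax.xmlreader.InputSource. RootName, root_name_of and the memory:example system identifier are hypothetical. The system identifier here only labels the source and is never opened.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class RootName(ContentHandler):
    """Remembers the name of the first element seen."""

    def __init__(self):
        super().__init__()
        self.root = None

    def startElement(self, name, attrs):
        if self.root is None:
            self.root = name


def root_name_of(xml_bytes, system_id="memory:example"):
    source = InputSource(system_id)
    # The byte stream takes precedence over opening system_id.
    source.setByteStream(io.BytesIO(xml_bytes))
    source.setEncoding("UTF-8")
    reader = xml.sax.make_parser()
    handler = RootName()
    reader.setContentHandler(handler)
    reader.parse(source)
    return handler.root


root = root_name_of(b"<root><x/></root>")
```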
|
class LexicalHandler |
|
Optional SAX2 handler for lexical events.
This handler is used to obtain lexical information about an XML
document, that is, information about how the document was encoded
(as opposed to what it contains, which is reported to the
ContentHandler), such as comments and CDATA marked section
boundaries.
To set the LexicalHandler of an XMLReader, use the setProperty
method with the property identifier
'http://xml.org/sax/handlers/LexicalHandler'. There is no
guarantee that the XMLReader will support or recognize this
property.
|
| |
- comment(self, content)
- Reports a comment anywhere in the document (including the
- DTD and outside the document element).
- content is a string that holds the contents of the comment.
- endCDATA(self)
- Reports the end of a CDATA marked section.
- endDTD(self)
- Signals the end of DTD declarations.
- endEntity(self, name)
- Reports the end of an entity. name is the name of the
- entity, and follows the same conventions as for
- startEntity.
- startCDATA(self)
- Reports the beginning of a CDATA marked section.
- The contents of the CDATA marked section will be reported
- through the characters event.
- startDTD(self, name, public_id, system_id)
- Report the start of the DTD declarations, if the document
- has an associated DTD.
- A startEntity event will be reported before declaration events
- from the external DTD subset are reported, and this can be
- used to infer from which subset DTD declarations derive.
- name is the name of the document element type, public_id the
- public identifier of the DTD (or None if none was supplied)
- and system_id the system identifier of the external subset (or
- None if none was supplied).
- startEntity(self, name)
- Report the beginning of an entity.
- The start and end of the document entity are not reported. The
- start and end of the external DTD subset are reported with the
- pseudo-name '[dtd]'.
- Skipped entities will be reported through the skippedEntity
- event of the ContentHandler rather than through this event.
- name is the name of the entity. If it is a parameter entity,
- the name will begin with '%'.
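Registering a lexical handler through setProperty can be sketched as follows. This assumes the Python 3 standard library xml.sax package, where the property identifier is exposed as xml.sax.handler.property_lexical_handler and its value differs from the identifier quoted above. LexicalRecorder and record_lexical_events are hypothetical. The recorder is a plain class that implements the callbacks by duck typing, and the startEntity and endEntity callbacks are omitted on the assumption that the standard library driver does not invoke them.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, property_lexical_handler


class LexicalRecorder:
    """Records comments and counts CDATA sections."""

    def __init__(self):
        self.comments = []
        self.cdata_sections = 0
        self.doctype = None

    def comment(self, content):
        self.comments.append(content)

    def startDTD(self, name, public_id, system_id):
        self.doctype = name

    def endDTD(self):
        pass

    def startCDATA(self):
        self.cdata_sections += 1

    def endCDATA(self):
        pass


def record_lexical_events(xml_bytes):
    reader = xml.sax.make_parser()
    recorder = LexicalRecorder()
    reader.setContentHandler(ContentHandler())
    # A reader that does not know the property raises SAXNotRecognizedException.
    reader.setProperty(property_lexical_handler, recorder)
    reader.parse(io.BytesIO(xml_bytes))
    return recorder


lexical = record_lexical_events(b"<r><!-- note --><![CDATA[x<y]]></r>")
```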
|
class Locator |
|
Interface for associating a SAX event with a document
location. A locator object will return valid results only during
calls to ContentHandler methods; at any other time, the
results are unpredictable.
|
| |
- getColumnNumber(self)
- Return the column number where the current event ends.
- getLineNumber(self)
- Return the line number where the current event ends.
- getPublicId(self)
- Return the public identifier for the current event.
- getSystemId(self)
- Return the system identifier for the current event.
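A handler can keep the locator it receives and query it inside later callbacks. The sketch below assumes the Python 3 standard library xml.sax package. LineRecorder and element_lines are hypothetical names. Whether a driver reports the start or the end of an event may vary, so the sample document places each tag on its own line, where both readings give the same line number.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class LineRecorder(ContentHandler):
    """Records the line number reported for each start tag."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def setDocumentLocator(self, locator):
        # Called by the parser before any other ContentHandler event.
        self._locator = locator

    def startElement(self, name, attrs):
        # The locator is only valid during callbacks such as this one.
        self.lines.append((name, self._locator.getLineNumber()))


def element_lines(xml_bytes):
    handler = LineRecorder()
    xml.sax.parseString(xml_bytes, handler)
    return handler.lines


lines = element_lines(b"<a>\n<b/>\n</a>")
```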
|
class Parser |
|
*DEPRECATED*
Basic interface for SAX (Simple API for XML) parsers. All SAX
parsers must implement this basic interface: it allows users to
register handlers for different types of events and to initiate a
parse from a URI, a character stream, or a byte stream. SAX
parsers should also implement a zero-argument constructor.
|
| |
- __init__(self)
- no doc string
- parse(self, systemId)
- Parse an XML document from a system identifier.
- parseFile(self, fileobj)
- Parse an XML document from a file-like object.
- setDTDHandler(self, handler)
- Register an object to receive basic DTD-related events.
- setDocumentHandler(self, handler)
- Register an object to receive basic document-related events.
- setEntityResolver(self, resolver)
- Register an object to resolve external entities.
- setErrorHandler(self, handler)
- Register an object to receive error-message events.
- setLocale(self, locale)
- Allow an application to set the locale for errors and warnings.
- SAX parsers are not required to provide localisation for errors
- and warnings; if they cannot support the requested locale,
- however, they must throw a SAX exception. Applications may
- request a locale change in the middle of a parse.
|
class SAXException(exceptions.Exception) |
|
Encapsulate an XML error or warning. This class can contain
basic error or warning information from either the XML parser or
the application: you can subclass it to provide additional
functionality, or to add localization. Note that although you will
receive a SAXException as the argument to the handlers in the
ErrorHandler interface, you are not actually required to throw
the exception; instead, you can simply read the information in
it.
|
| |
- __getitem__(self, ix)
- Avoids weird error messages if someone does exception[ix] by
- mistake, since Exception has __getitem__ defined.
- __init__(self, msg, exception=None)
- Creates an exception. The message is required, but the exception
- is optional.
- __str__(self)
- Create a string representation of the exception.
- getException(self)
- Return the embedded exception, or None if there was none.
- getMessage(self)
- Return a message for this exception.
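Wrapping an application error in a SAXException can be sketched as follows, assuming the Python 3 standard library class xml.sax.SAXException. The load_setting function and its message text are hypothetical.

```python
from xml.sax import SAXException


def load_setting(text):
    """Convert text to an int, reporting failures as a SAXException."""
    try:
        return int(text)
    except ValueError as exc:
        # The original error travels along as the embedded exception.
        raise SAXException("setting is not an integer", exc)


try:
    load_setting("abc")
except SAXException as err:
    message = err.getMessage()
    cause = err.getException()
```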
|
class SAXParseException(SAXException) |
|
Encapsulate an XML parse error or warning.
This exception will include information for locating the error in
the original XML document. Note that although the application will
receive a SAXParseException as the argument to the handlers in the
ErrorHandler interface, the application is not actually required
to throw the exception; instead, it can simply read the
information in it and take a different action.
Since this exception is a subclass of SAXException, it inherits
the ability to wrap another exception.
|
| |
- __init__(self, msg, exception, locator)
- Creates the exception. The exception parameter is allowed to be None.
- __str__(self)
- Create a string representation of the exception.
- getColumnNumber(self)
- The column number of the end of the text where the exception
- occurred.
- getLineNumber(self)
- The line number of the end of the text where the exception occurred.
- getPublicId(self)
- Get the public identifier of the entity where the exception occurred.
- getSystemId(self)
- Get the system identifier of the entity where the exception occurred.
|
class XMLFilter(XMLReader) |
|
Interface for a SAX2 parser filter.
A parser filter is an XMLReader that gets its events from another
XMLReader (which may in turn also be a filter) rather than from a
primary source like a document or other non-SAX data source.
A filter can modify a stream of events before passing it on to its
handlers.
|
| |
- __init__(self, parent=None)
- Creates a filter instance, allowing applications to set the
- parent on instantiation.
- getParent(self)
- Returns the parent of this filter.
- setParent(self, parent)
- Sets the parent XMLReader of this filter. The argument may
- not be None.
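A filter that rewrites events on their way to the application can be sketched as follows. This assumes the Python 3 standard library class xml.sax.saxutils.XMLFilterBase, which plays the role of XMLFilter there and forwards every event to the registered handlers by default. UppercaseNames, TagCollector and upper_tags are hypothetical names.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLFilterBase


class UppercaseNames(XMLFilterBase):
    """Upper-cases element names before passing events downstream."""

    def startElement(self, name, attrs):
        super().startElement(name.upper(), attrs)

    def endElement(self, name):
        super().endElement(name.upper())


class TagCollector(ContentHandler):
    """Collects the element names it is given."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


def upper_tags(xml_bytes):
    parent = xml.sax.make_parser()   # The real parser is the filter's parent.
    event_filter = UppercaseNames(parent)
    collector = TagCollector()
    event_filter.setContentHandler(collector)
    event_filter.parse(io.BytesIO(xml_bytes))
    return collector.tags


filtered = upper_tags(b"<root><item/></root>")
```

Because a filter is itself an XMLReader, filters can be chained by passing one filter as the parent of another.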
|
class XMLReader |
|
Interface for reading an XML document using callbacks.
XMLReader is the interface that an XML parser's SAX2 driver must
implement. This interface allows an application to set and query
features and properties in the parser, to register event handlers
for document processing, and to initiate a document parse.
All SAX interfaces are assumed to be synchronous: the parse
methods must not return until parsing is complete, and readers
must wait for an event-handler callback to return before reporting
the next event.
This interface replaces the (now deprecated) SAX 1.0 Parser
interface. The XMLReader interface contains two important
enhancements over the old Parser interface:
* it adds a standard way to query and set features and
properties; and
* it adds Namespace support, which is required for many
higher-level XML standards.
|
| |
- __init__(self)
- no doc string
- getContentHandler(self)
- Returns the current ContentHandler.
- getDTDHandler(self)
- Returns the current DTDHandler.
- getEntityResolver(self)
- Returns the current EntityResolver.
- getErrorHandler(self)
- Returns the current ErrorHandler.
- getFeature(self, name)
- Looks up and returns the state of a SAX2 feature.
- getProperty(self, name)
- Looks up and returns the value of a SAX2 property.
- parse(self, source)
- Parse an XML document from a system identifier (as a string) or an
- InputSource.
- setContentHandler(self, handler)
- Registers a new object to receive document content events.
- setDTDHandler(self, handler)
- Register an object to receive basic DTD-related events.
- setEntityResolver(self, resolver)
- Register an object to resolve external entities.
- setErrorHandler(self, handler)
- Register an object to receive error-message events.
- setFeature(self, name, state)
- Sets the state of a SAX2 feature.
- setLocale(self, locale)
- Allow an application to set the locale for errors and warnings.
- SAX parsers are not required to provide localisation for errors
- and warnings; if they cannot support the requested locale,
- however, they must throw a SAX exception. Applications may
- request a locale change in the middle of a parse.
- The locale value must be an ISO 639 two-letter language code.
- The value is case-insensitive.
- setProperty(self, name, value)
- Sets the value of a SAX2 property.
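Since a reader may not recognize or support a given feature, applications usually guard calls to setFeature. The sketch below assumes the Python 3 standard library xml.sax package. try_set_feature is a hypothetical helper and the example.invalid feature name is made up to trigger the failure path.

```python
import xml.sax
from xml.sax import SAXNotRecognizedException, SAXNotSupportedException
from xml.sax.handler import feature_namespaces


def try_set_feature(reader, name, state):
    """Set a feature and report whether the reader accepted it."""
    try:
        reader.setFeature(name, state)
    except (SAXNotRecognizedException, SAXNotSupportedException):
        return False
    return True


reader = xml.sax.make_parser()
namespaces_enabled = try_set_feature(reader, feature_namespaces, True)
unknown_enabled = try_set_feature(
    reader, "http://example.invalid/features/hypothetical", True
)
```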
| |