CdataVsEntities


A common suggestion is that an application of XML should require CDATA marked sections, rather than entities, as its escaping mechanism.

Many XML processing systems do not preserve that distinction. XML parsers read CDATA sections and entities alike and report the same character data to the application. Only some XML writing libraries let the caller choose whether character data is written as CDATA marked sections or as entities; others use heuristics to pick one or the other automatically.
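As a rough illustration of both points, the sketch below uses Python's standard library: `xml.etree.ElementTree` for reading and `xml.dom.minidom` for writing. The `note` element and the sample text are made up for this example. It suggests that a parser hands back the same character data for either escaping style, and that minidom is one of the writers that does let the caller choose the representation.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

SAMPLE_TEXT = "2 < 4 & 4 > 2"  # hypothetical payload containing markup characters

# Reading: the same character data, escaped two different ways.
escaped_source = "<note>2 &lt; 4 &amp; 4 &gt; 2</note>"
cdata_source = "<note><![CDATA[2 < 4 & 4 > 2]]></note>"

text_from_entities = ET.fromstring(escaped_source).text
text_from_cdata = ET.fromstring(cdata_source).text

# Writing: minidom lets the caller pick a text node or a CDATA section.
def serialize(use_cdata):
    doc = minidom.Document()
    root = doc.createElement("note")
    doc.appendChild(root)
    if use_cdata:
        root.appendChild(doc.createCDATASection(SAMPLE_TEXT))
    else:
        root.appendChild(doc.createTextNode(SAMPLE_TEXT))
    return root.toxml()

written_with_entities = serialize(use_cdata=False)
written_with_cdata = serialize(use_cdata=True)

# Round trip: both serializations parse back to the original text.
round_trip_entities = ET.fromstring(written_with_entities).text
round_trip_cdata = ET.fromstring(written_with_cdata).text

print(text_from_entities == text_from_cdata)
print(round_trip_entities == round_trip_cdata == SAMPLE_TEXT)
```

An application that receives only the parsed text has no way to tell which form the author used, which is why a requirement about the form is hard to enforce downstream.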

The XML specification states, "CDATA sections may occur anywhere character data may occur; they are used to escape blocks of text containing characters which would otherwise be recognized as markup." A CDATA section may be used to escape blocks of text, but it has no bearing on what is considered character data.

Conforming applications of XML should not require that CDATA marked sections be used in place of entities.


From a previous version of EchoExample:

    <content type="text/html" mode="escaped" xml:lang="en">
      <body>
        &lt;p&gt;Hello, &lt;em&gt;weblog&lt;/em&gt; world! 2 &amp;lt; 4!&lt;/p&gt;
      </body>
    </content>

    <content type="text/html" mode="cdata">
      <body><![CDATA[ <p>Hello, <em>weblog</em> world! 2 &lt; 4!</p> ]]></body>
    </content>
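The two fragments above can be fed to a parser to see what an application would actually receive. This is a minimal sketch using Python's `xml.etree.ElementTree`; the helper name `body_text` is hypothetical, and surrounding whitespace is trimmed because the two samples are indented differently.

```python
import xml.etree.ElementTree as ET

ESCAPED = """<content type="text/html" mode="escaped" xml:lang="en">
  <body>
    &lt;p&gt;Hello, &lt;em&gt;weblog&lt;/em&gt; world! 2 &amp;lt; 4!&lt;/p&gt;
  </body>
</content>"""

CDATA = """<content type="text/html" mode="cdata">
  <body><![CDATA[ <p>Hello, <em>weblog</em> world! 2 &lt; 4!</p> ]]></body>
</content>"""

def body_text(xml_source):
    """Return the character data of <body>, with outer whitespace trimmed."""
    return ET.fromstring(xml_source).find("body").text.strip()

escaped_body = body_text(ESCAPED)
cdata_body = body_text(CDATA)

print(escaped_body)
print(escaped_body == cdata_body)
```

In both cases the parser reports the HTML string itself; the `mode` attribute is the only place the difference survives.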

cdata mode vs escaped mode

[JimDabell] What is the difference between these two modes? They seem to solve exactly the same problem. Is this another attempt to bloat the format to cater to regexp parsers?

See also NormanWalsh's What's Wrong With Necho 0.1 in which he suggests removing <content> from the core entirely and renaming <summary> to <description>.

[MartinAtkins] Every XML parser I've seen returns both of these identically to the application, so I don't see the value in specifying one or the other in our format. The choice is outside the scope of Atom -- it's an XML issue.


CategorySyntax