Attempting to read and write emProject files as XML

Segger Embedded Studio (SES) project files (.emProject) appear to be XML with a custom DOCTYPE:

<!DOCTYPE CrossStudio_Project_File>
<solution Name="ble_app_MA_lightbulb_pca10040_s132" target="8" version="2">
  <project Name="ble_app_MA_lightbulb_pca10040_s132">
    <configuration
      c_preprocessor_definitions="DEBUG; DEBUG_NRF"
      gcc_optimization_level="None" />



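The file can be loaded into an XmlDocument. PreserveWhitespace keeps the whitespace between elements, and setting XmlResolver to Nothing stops .NET from trying to resolve anything external for the DOCTYPE.

Private xmlDoc As XmlDocument = New XmlDocument()
xmlDoc.PreserveWhitespace = True
xmlDoc.XmlResolver = Nothing
' projectPath is a placeholder for the path to the .emProject file
xmlDoc.Load(projectPath)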

Once the XML is loaded, an attribute such as c_user_include_directories can be read with an XPath query:

sIncludes = xmlDoc.SelectSingleNode("solution/project/configuration[@c_user_include_directories]").Attributes("c_user_include_directories").Value
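
The one-line read above throws a NullReferenceException if no configuration element carries the attribute. A slightly more defensive sketch follows; the module name, the function name and the cut-down sample XML are made up for illustration and are not taken from a real project file.

```vbnet
Imports System
Imports System.Xml

Module ReadProjectAttribute

    ' Returns the attribute value from the first configuration element that has it,
    ' or Nothing when no such element exists.
    Function ReadConfigurationAttribute(doc As XmlDocument, attributeName As String) As String
        Dim node As XmlNode = doc.SelectSingleNode("solution/project/configuration[@" & attributeName & "]")
        If node Is Nothing Then Return Nothing
        Return node.Attributes(attributeName).Value
    End Function

    Sub Main()
        Dim doc As New XmlDocument()
        doc.XmlResolver = Nothing
        ' Made-up, cut-down project text for illustration only.
        doc.LoadXml("<!DOCTYPE CrossStudio_Project_File>" &
                    "<solution Name=""demo""><project Name=""demo"">" &
                    "<configuration Name=""Common"" c_user_include_directories=""../inc;../config"" />" &
                    "</project></solution>")
        Console.WriteLine(ReadConfigurationAttribute(doc, "c_user_include_directories"))
        Console.WriteLine(ReadConfigurationAttribute(doc, "no_such_attribute") Is Nothing)
    End Sub

End Module
```

Like SelectSingleNode in the original line, this only looks at the first matching configuration element.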


However, writing this XML document object back to a file has a few problems. The DOCTYPE tag changes: an empty pair of square brackets appears at the end, due to a bug in the .NET Framework, so you get <!DOCTYPE CrossStudio_Project_File []>. Although this is valid XML, SES then says that the file is not a valid project.

The workaround is to clone the XML document and replace its document type node with a new one, created with Nothing (in VB.NET, equivalent to null in C#) for the internal subset, as distinct from an empty string.

' Clone returns an XmlNode, so cast it back to XmlDocument
Dim n As XmlDocument = DirectCast(xmlDoc.Clone(), XmlDocument)

Dim parent As XmlNode = n.DocumentType.ParentNode
' The 4th parameter of CreateDocumentType (the internal subset) has to be Nothing to avoid getting [] in the DOCTYPE, which is what you get by default and with an empty string
parent.ReplaceChild(n.CreateDocumentType("CrossStudio_Project_File", Nothing, Nothing, Nothing), n.DocumentType)


Using the replaced DOCTYPE leaves a space after CrossStudio_Project_File, but this is a minor cosmetic effect and SES copes with it. A worse problem is that the original project file mixes two layouts, with some attributes each on their own line and others in-line, and the .NET Framework forces them into one consistent layout. For example, the configuration attributes (Name, arm_architecture etc.) are each on their own line, as in the excerpt at the top,

but the files in folders are in-line:

<folder Name="nRF_Log">
      <file file_name="../../../../../../components/libraries/experimental_log/src/nrf_log_backend_rtt.c" />
      <file file_name="../../../../../../components/libraries/experimental_log/src/nrf_log_backend_serial.c" />
      <file file_name="../../../../../../components/libraries/experimental_log/src/nrf_log_backend_uart.c" />
      <file file_name="../../../../../../components/libraries/experimental_log/src/nrf_log_default_backends.c" />
      <file file_name="../../../../../../components/libraries/experimental_log/src/nrf_log_frontend.c" />
      <file file_name="../../../../../../components/libraries/experimental_log/src/nrf_log_str_formatter.c" />

If the in-line attributes are written back on their own lines, this does not affect SES's ability to read the project correctly, but it still causes a problem.

Attribute line formatting problem

Although XmlDocument has a PreserveWhitespace property and XmlWriterSettings has Indent and NewLineOnAttributes, no combination of these can reproduce the mixed formatting of the original project file. PreserveWhitespace only keeps whitespace between nodes; the line breaks between attributes inside a tag are not part of the loaded document, so they cannot be written back. My use case was to change only one specific line (the c_user_include_directories attribute), leaving the rest of the project file unchanged so that a diff tool such as WinMerge sees a change to that line only.
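
As a rough illustration of why the writer settings cannot mix layouts, the sketch below serialises a small fragment twice through XmlWriterSettings, once with NewLineOnAttributes on and once off. The module name, the function name and the sample XML are assumptions made for the example. In both runs every element gets the same attribute layout, whatever the input looked like.

```vbnet
Imports System
Imports System.Text
Imports System.Xml

Module UniformAttributeLayout

    ' Serialises the document to a string using the given attribute layout setting.
    Function Serialize(doc As XmlDocument, newLineOnAttributes As Boolean) As String
        Dim settings As New XmlWriterSettings()
        settings.Indent = True
        settings.NewLineOnAttributes = newLineOnAttributes
        settings.OmitXmlDeclaration = True
        Dim sb As New StringBuilder()
        Using writer As XmlWriter = XmlWriter.Create(sb, settings)
            doc.Save(writer)
        End Using
        Return sb.ToString()
    End Function

    Sub Main()
        Dim doc As New XmlDocument()
        ' Made-up fragment: one element written over several lines, one written in-line.
        doc.LoadXml("<project><configuration" & vbLf &
                    "  Name=""Common""" & vbLf &
                    "  gcc_optimization_level=""None"" />" &
                    "<file file_name=""a.c"" /></project>")
        Console.WriteLine(Serialize(doc, True))   ' every attribute on its own line
        Console.WriteLine(Serialize(doc, False))  ' every attribute in-line
    End Sub

End Module
```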

Therefore, if the file written back should be as unchanged as possible apart from the lines that were intentionally changed, it may be easier to treat the emProject file as plain lines of text rather than as XML.
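
A minimal sketch of the text-based approach, assuming the attribute appears in the file as name="value" and that only its first occurrence should change. It reads the whole file as one string rather than splitting it into lines, so that line endings and indentation are left exactly as they were. The module and function names are hypothetical, the new value is assumed to be XML-escaped already, and file encoding is left at the .NET defaults.

```vbnet
Imports System
Imports System.IO
Imports System.Text.RegularExpressions

Module EmProjectTextEdit

    ' Replaces the value of the first attributeName="..." found in the text.
    ' All other text, including line endings and indentation, is returned unchanged.
    ' newValue is inserted as-is, so it is assumed to be XML-escaped already.
    Function ReplaceFirstAttributeValue(text As String, attributeName As String, newValue As String) As String
        ' Group 1 captures the name, "=" and opening quote; group 2 the closing quote.
        Dim pattern As New Regex("(\b" & Regex.Escape(attributeName) & "\s*=\s*"")[^""]*("")")
        ' A MatchEvaluator is used so that "$" in newValue is not treated as a substitution.
        Return pattern.Replace(text, Function(m As Match) m.Groups(1).Value & newValue & m.Groups(2).Value, 1)
    End Function

    Sub Main(args As String())
        ' Hypothetical usage: <program> <project file> <new include list>
        If args.Length < 2 Then
            Console.WriteLine("usage: <project file> <new include list>")
            Return
        End If
        Dim original As String = File.ReadAllText(args(0))
        Dim updated As String = ReplaceFirstAttributeValue(original, "c_user_include_directories", args(1))
        File.WriteAllText(args(0), updated)
    End Sub

End Module
```

If a project defines the attribute in more than one configuration, only the first one is touched here, so a real tool would need some way to pick the right occurrence.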

Lesson: Sometimes the right thing to do is to treat an XML file as just lines of text, instead of using the built-in .NET XML handlers, which can overcomplicate things.