Software Research and the Industry

Dirk Riehle’s blog about everything computer science, applied and more


Wiki Creole Grammar, Schema, and Transformations

Please find below an EBNF grammar, XML schema, and XSLT transformations for Wiki Creole, currently in version 1.0. Wiki Creole is the first (and only) community standard for wiki markup.

These specifications were taken from the following two technical reports:

Wiki Creole EBNF grammar

Wiki Creole XML schema definition

Wiki Creole to XHTML transformation

XML to Wiki Creole transformation

Despite our test suites, some bugs very likely remain. If you find any, please let us know!

The best way to get in contact with us is to address Martin Junghans through the SourceForge WikiCreole project and cc: Dirk Riehle.

14 Comments

  • 1 marko // Jan 21, 2008 at 10:08 am

    The EBNF links are broken: From the first one “wiki-creole/” must be removed, from the latter the quotation marks at the end.

  • 2 Dirk Riehle // Jan 21, 2008 at 10:21 am

    @marko: Fixed, thank you.

  • 3 Martien // Feb 12, 2008 at 8:42 pm

    Generating code using ANTLRWorks 1.1.7 results in errors in rule text_bolditalcontent.

    I am an absolute novice on the subject and would appreciate your help.

  • 4 Martin // Feb 12, 2008 at 10:27 pm

    @Martien
    ANTLRWorks does not always show the same behavior as ANTLR does. I couldn’t use ANTLRWorks for development once the grammar grew large. You will probably also have to increase the JVM heap size for code generation, e.g. with the JVM option -Xmx1024M.

  • 5 Martien // Feb 13, 2008 at 12:14 pm

    Thanks Martin

    Problem solved by generating from the command line using org.antlr.Tool.

  • 6 V // May 2, 2008 at 6:28 pm

    I’m trying to implement WikiCreole using ocamlyacc or Menhir (an LR(1) parser generator for OCaml). But WikiCreole’s EBNF uses some ANTLR-specific features (I think only this input.LA thing) that make it difficult to adapt to another parser generator … Does somebody have an idea about that?

  • 7 Martin // May 2, 2008 at 7:57 pm

    V, first you have to consider that ANTLR generates LL(*) parsers, in contrast to your LR(1).
    We had to use the explicit input.LA checks because ANTLR commits to a derivation too early, i.e. even while other derivations are still consistent with the current look-ahead. It does not extend the look-ahead as far as necessary to identify the only applicable derivation. Choosing the wrong derivation then results in an exception, even though the grammar is correct.

    The _grammar_ itself does not need these explicit look-ahead queries, but the _ANTLR grammar specification_ seems to need them as an aid.
    If I recall LR(1) parsers correctly, you can’t rely on the grammar specification alone for your parser generator, because the WikiCreole language specification needs a look-ahead greater than 1.
    Nevertheless, good luck with your work!

  • 8 Greg // May 15, 2008 at 11:42 pm

    Hi guys. I’m interested in using this XML format as my intermediate (stored) stage. However, it seems that I’ll have to write the code (Java) to translate from Creole to ‘X-Creole’. I realise there’s an XSD, but would some of you have some example XML files for me to test with? Thanks!

  • 9 Martin // May 16, 2008 at 5:13 am

    Hi Greg, you are right. We do rely on some code to transform the representation of the wiki page after parsing into the XML representation. I’m not sure whether you can transform the parse tree (or AST) to XML using ANTLR’s domain-specific language. We implemented a Java DFS (depth-first search) parse tree walker that generates the XML file during the walk. I’ll mail some files to you. Regards.

  • 10 Dirk Riehle // May 16, 2008 at 9:23 am

    Greg, Martin: I had been wondering about this for a while—we knew that the tree walker is missing from our publications, but then you can’t really publish some Java code.

    What you can do, of course, is create an open source project. So for that reason I created a SourceForge project called wikicreole a while back. Is there any interest in writing Tree Walker code for that project?

  • 11 Jose Chillan // Aug 11, 2008 at 6:18 pm

    Hi,
    I am very interested in your work. I went to the SourceForge project and saw that no files had been committed. Is there anywhere I could get the Tree Walker for the ANTLR grammar that produces the XML?
    Regards

  • 12 Martin // Aug 13, 2008 at 10:51 pm

    Hi Jose and everyone,
    The SourceForge project Wiki Creole Parser is now available. It generates a scanner and a parser from an ANTLR grammar file, parses a wiki page, and transforms the parse tree to the XML interchange format.
    Please don’t hesitate to contact me with any questions or comments.

  • 13 Varun // Oct 8, 2008 at 6:28 pm

    Hello all,

    I have one question, which might sound trivial to you. I would like to know what the difference is between “with_extension” and “without_extension” in the files above.

    Second, Martin, you mailed Greg some XML documents he asked for (see the comments above). Could you mail them to me as well, please?

    Thanks!

  • 14 Dirk Riehle // Oct 12, 2008 at 10:51 am

    The difference between with_extension and without_extension is that the latter is a grammar (and related files) for the plain original Wiki Creole, while with_extension adds a token that serves as a hook for extending the grammar with new syntactic elements. So if all you want is what Wiki Creole 1.0 can give you, use without_extension; if you are thinking about adding your own syntactic elements, try with_extension. The differences are minimal: just the hook for the extended syntax.

    As to the files, I’ll ask Martin. I think we might have to clean up this page; maybe you can already find what you are looking for at sf.net/projects/wikicreole?
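
The look-ahead point from comment 7 can be illustrated with a small, self-contained Java sketch. It is not taken from the Wiki Creole ANTLR grammar: the class LookAheadSketch, its LA method (named after ANTLR's input.LA(i)), and the returned labels are assumptions made for illustration only. It works on characters rather than tokens and only shows why a single symbol of look-ahead cannot tell Creole's {{{ (nowiki) from {{ (image).

```java
// Hypothetical sketch; not part of the published Wiki Creole grammar.
final class LookAheadSketch {
    private final String input;
    private int pos = 0;

    LookAheadSketch(String input) {
        this.input = input;
    }

    /** Returns the i-th character ahead (1-based) without consuming it, or -1 past the end. */
    int LA(int i) {
        int index = pos + i - 1;
        return index < input.length() ? input.charAt(index) : -1;
    }

    /** Advances the read position by one character. */
    void consume() {
        if (pos < input.length()) {
            pos++;
        }
    }

    /** Decides which construct starts at the current position by peeking only. */
    String classify() {
        if (LA(1) == '{' && LA(2) == '{') {
            // Two braces are not enough: the third character decides.
            return LA(3) == '{' ? "nowiki" : "image";
        }
        if (LA(1) == '[' && LA(2) == '[') {
            return "link";
        }
        return "text";
    }
}
```

For example, new LookAheadSketch("{{{x}}}").classify() returns "nowiki", while the same call on "{{a.png|alt}}" returns "image". A parser that commits after seeing only the first brace has to guess, which is the situation the explicit look-ahead checks are meant to avoid.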
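
The depth-first tree walker mentioned in comment 9 could look roughly like the following Java sketch. This is not the walker from the SourceForge project: the Node class, the XmlTreeWalker class, and the element names used in the example are hypothetical, and the real walker operated on the ANTLR parse tree and followed the Wiki Creole XML schema.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parse tree node.
final class Node {
    final String name;                         // element name (illustrative only)
    final String text;                         // text content, or null for inner nodes
    final List<Node> children = new ArrayList<>();

    Node(String name, String text) {
        this.name = name;
        this.text = text;
    }

    /** Appends a child and returns this node so calls can be chained. */
    Node add(Node child) {
        children.add(child);
        return this;
    }
}

final class XmlTreeWalker {
    /** Walks the tree depth-first and returns the generated XML as a string. */
    static String toXml(Node root) {
        StringBuilder out = new StringBuilder();
        walk(root, out);
        return out.toString();
    }

    // Opening tag on entry, children in order, closing tag on exit.
    private static void walk(Node node, StringBuilder out) {
        out.append('<').append(node.name).append('>');
        if (node.text != null) {
            out.append(escape(node.text));
        }
        for (Node child : node.children) {
            walk(child, out);
        }
        out.append("</").append(node.name).append('>');
    }

    // Escapes the characters that are not allowed verbatim in XML text content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Because each node writes its opening tag before visiting its children and its closing tag afterwards, the XML is produced during the walk itself and no second pass over the tree is needed.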
