cxlang is an experimental language with the goal of providing unlimited flexibility. Specifically, the compiler has no built-in knowledge of the language it is compiling; instead, it accepts a full language grammar at startup. This accomplishes three goals:
- The compiler is language-agnostic. Any source file (and even structured binary files) can be parsed as long as an appropriate grammar is defined.
- The compiler is runtime-agnostic. Any output format can be generated, including bytecode, machine code, source code and analyses of the input.
- A grammar may be self-modifying, meaning that it may be possible to switch between languages within a single compilation unit, or to extend the initial language grammar via escape mechanisms in the source code.
It is the third goal that is the underlying basis for the cxlang project. It is common to hear programmers discuss the strengths and weaknesses of the various mainstream programming languages. Although good ideas are often proposed, it is rare that someone will make a serious attempt at improving a language, and extremely rare that such an attempt will receive mainstream attention. More commonly, existing language features are (ab)used to work around a shortcoming. This works up to a point, but each of the many real-world examples carries significant downsides.
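To make the idea of a self-modifying grammar more concrete, here is a minimal C++ sketch. It is not cxlang code and does not reflect the cxlang implementation: the `Grammar` alias, the `tokenize` function and the `@keyword` escape are all hypothetical names invented for this illustration. The "grammar" is reduced to a set of keywords supplied by the caller, and the escape extends that set part-way through the input.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical illustration only: none of these names come from cxlang.
struct Token {
    std::string text;
    bool is_keyword;
};

// The "grammar" is just a set of keywords supplied by the caller,
// so the tokenizer itself has no built-in knowledge of any language.
using Grammar = std::set<std::string>;

// Splits `source` on whitespace and classifies each word against `grammar`.
// The made-up escape "@keyword X" adds X to the grammar for the rest of
// the input, and produces no token of its own.
std::vector<Token> tokenize(const std::string& source, Grammar grammar) {
    std::vector<Token> out;
    std::istringstream in(source);
    std::string word;
    while (in >> word) {
        if (word == "@keyword") {
            std::string added;
            if (in >> added) {
                grammar.insert(added);
            }
            continue;
        }
        out.push_back({word, grammar.count(word) > 0});
    }
    return out;
}
```

With a grammar containing only `if`, tokenizing `if unless @keyword unless unless` classifies the first `unless` as an ordinary word and the last one as a keyword, because the escape changed the grammar in between.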
cxlang documentation
The following documents provide detailed coverage of the cxlang project.
cxlang usage
While the language is still in development, a number of planned sub-projects may help to illustrate the goals and nature of cxlang.
Top-level Processors
The following are “top-level” cxlang grammars. These would typically be loaded into the parser’s state by build rules before the source code is parsed. To be clear: each of these is a grammar written IN cxlang, rather than part of the cxlang parser itself.
- C macro preprocessor – Takes a C program as input and emits a preprocessed token stream. A prototype exists which is capable of processing the Mac system headers and at least a subset of Boost. No formal test cases exist and the processor is probably not yet standards-compliant. Performance is sufficient for prototyping but not yet for real work.
- C++11 syntax parser – Takes a token stream from the preprocessor and creates a parse tree. This is in early development but progress should be rapid.
- C++ syntax emitter – Takes a C++ parse tree and converts it back to C++ source. Since we currently lack an actual compiler implementation, this is the simplest way to prove that everything is working.
- JavaScript syntax emitter – Takes a C++ parse tree and converts it to JavaScript source. This should allow a valid C++ program to be compiled to JavaScript, albeit without the C standard library.
- Escaping from C++ – Adds some custom escaping syntax to the C++11 syntax parser which allows the parser grammar to be modified on the fly from within the C++ source. This is where things start to get interesting.
- C++ source annotation – Uses the preprocessor and syntax parser to annotate a C++ source file, providing an IDE with the ability to understand the purpose of each token.
- C++ style re-flow – Takes an annotated C++ source file and re-flows it to suit a particular coding standard.
Escaped C++
The following are implemented as escaped C++ to ensure that the escaping mechanism is suitable without requiring modifications to the C++11 syntax parser. They also serve a useful purpose in their own right.
- C++ reflection – This extension provides C++ functions to query the runtime type of objects, allowing techniques such as calling a member on an anonymous pointer.
- Source doc generator – Extends the preprocessor and syntax parser to watch for function comments and emits the processed documentation.
- C++ smart pointer – Implements a native smart pointer type which provides runtime performance benefits similar to those of Automatic Reference Counting (ARC).
- C++ public_cast – Implements a ‘public_cast<>’ mechanism whereby C++ code can ‘illegally’ access private or protected members. Useful for printf() debugging, for example.
- C++ dynamic arguments – Allows the specification of a “safe” variadic function. The compiler creates a C++ object with the parameters and passes it to the variadic function which may then use C++ reflection to inspect and extract the parameters.
- C++ no-op statement – Creates a ‘__noop()’ statement whose parameters are parsed but generate no runtime code. Where the parameter expressions would have side-effects, a warning is generated. Useful for ASSERT() statements which are compiled out in release builds, for example.
- C++ static analyser – Allows the specification of entry and exit conditions for functions, and class member requirements. Specifications may include C++ expressions, thread-access rules, etc. Implicit specifications can be formed from the source itself, considering array bounds, control flow expressions, and so on. As far as possible, ensures that the source code never violates such rules.
- C++ runtime analyser – Extends the static analyser and reflection mechanisms to add runtime debug checks to the compile tree. Configurable to trade off runtime performance versus safety.
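As a rough picture of what the “C++ dynamic arguments” extension is aiming for, the following sketch approximates the idea in standard C++17, without any cxlang support. It is an assumption-laden stand-in, not the planned design: the `Arg` alias and the `describe` function are hypothetical, and `std::variant` takes the place of the compiler-generated parameter object and of the reflection mechanism described above.

```cpp
#include <initializer_list>
#include <string>
#include <variant>

// Hypothetical stand-in for the parameter object the compiler would build.
// Each argument carries its own type tag.
using Arg = std::variant<int, double, std::string>;

// A "safe" variadic function: the callee inspects the type of each
// argument instead of trusting a format string, as printf() must.
std::string describe(std::initializer_list<Arg> args) {
    std::string out;
    for (const Arg& arg : args) {
        if (!out.empty()) {
            out += ", ";
        }
        if (std::holds_alternative<int>(arg)) {
            out += "int:" + std::to_string(std::get<int>(arg));
        } else if (std::holds_alternative<double>(arg)) {
            out += "double";
        } else {
            out += "string:" + std::get<std::string>(arg);
        }
    }
    return out;
}
```

A call such as `describe({1, std::string("x")})` yields `int:1, string:x`. The limitation of this approximation is that the set of permitted types is fixed in `Arg`; the extension described above would instead rely on reflection to handle arbitrary parameter types.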