Dia2Code v. 0.9.0-lovasoa

SUMMARY

This program generates code for many languages from a UML Dia diagram.

DESCRIPTION

This program is a small utility that makes code from a Dia diagram. Supported languages are: Ada, C, C++, C#, IDL, Java, PHP(4,5), Python, Ruby, shapefiles, and SQL create statement files.

Its intended purpose is to ease the programmer's work by generating the structure of the classes in an Object Oriented language (like C++, Java and C#) from a graphical representation of them (a Dia Diagram).

STATUS

Dia2Code generates Ada, C, C++, IDL, Java, PHP, Python, C#, and Ruby files. It can also generate a file with SQL's CREATE TABLE statements and .bat files for creating shapes. Templates and packages are supported but need testing.

The basic functionality can now be considered complete. Minor releases should fix bugs and add small improvements.

The #include and import directives are generated from the types used in each class but, generally, only if those types are themselves declared in the diagram. Classes are searched for in the types of: attributes, parents, method return types, method parameters, dependencies and associations.

Dia2Code looks for Generalizations, Realizations, Implements, Dependencies and Associations and adds that information to the class list.

Packages are implemented considering the geometry information of the objects (packages and classes).

Feel free to have a try and send me your comments.

LICENSE

This program is distributed under the GNU GPL. Read the COPYING file for details.

REQUIREMENTS

I've only tested it under Linux/i386 and Win32. I have received reports of successful compilation on *BSD and GNU Hurd. It should work on other platforms, but your mileage may vary.

  • Dia (I work with 0.88 but older versions may be OK). It is not needed for compilation, only to draw the diagrams.
  • libxml2 headers and development libraries (I use 2.4.1 but it may be OK if you use any version greater than 2.0.0).
  • A C compiler.
  • automake and autoconf (actually, optional)

INSTALLATION

Usually:

    $ ./configure

    $ make

    # make install

Read the INSTALL file for details.

If that doesn't work for you, try (from the innermost dia2code directory):

    $ cc -I/usr/include/libxml *.c -O2 -o dia2code -lxml2 -ldl

    $ strip dia2code

I've modified Makefile.am and configure.in so "configure" will hopefully find everything it needs.

OPERATION

$ dia2code <parameters>

Where <parameters> can be any combination of:

-h|--help
   prints help on parameters and exits immediately

-t (ada|c|cpp|idl|java|php|python|shp|sql|csharp)
  selects the language to generate: Ada (ada), C (c), C++ (cpp),
  IDL (idl), Java (java), PHP (php), Python (python), shapefiles (shp),
  SQL "create table" files (sql) or C# (csharp).
  The default is C++.

-d <outdir>
  tells it to put the files in the specified directory.
  The default is to output files in the current directory.

-nc
  asks dia2code to not overwrite existing files.

-cl <classlist>
  generates code only for the classes specified in the
  comma-separated <classlist>.

-v
  inverts the class list selection.  When used without -cl,
  prevents any file from being created.

-l <licensefile>
  prepends the contents of the specified file to each source file
  generated.

<diagramfile>
  Name of the dia file (compressed or not) that contains the
  UML diagram to be parsed.

The only mandatory parameter is the diagram file name.

Note: Parameters can be specified in any order.

EXAMPLES

$ dia2code -t java test.dia

Will generate the *.java files for the classes in the test.dia diagram and put them in the current directory.

$ dia2code -nc -d ~/C++ hw.dia

Will generate the *.cpp and *.h files for the classes in the hw.dia diagram and put them in the directory ~/C++. It won't overwrite any existing file.

$ dia2code -t java -cl Base,Derived test.dia

Will create the Java files only for classes "Base" and "Derived".

$ dia2code -cl B*,*Foo test.dia

Will create the *.cpp and *.h files for classes that begin with "B" or end with "Foo". NOTE: You can specify only one asterisk per pattern, and only at the beginning or the end. NOTE2: You may have to quote or escape the asterisks when typing on the shell (e.g. "\*" instead of "*").
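
The pattern rule in the note above (a single asterisk, at the start or at the end) can be sketched as a small C function. This only illustrates the rule and is not dia2code's own matching code; the name class_pattern_matches is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of dia2code: returns true when `name`
   matches `pattern`. The pattern may carry a single asterisk, either
   at its start ("*Foo") or at its end ("B*"). */
bool class_pattern_matches(const char *pattern, const char *name)
{
    assert(pattern != NULL && name != NULL);

    size_t plen = strlen(pattern);
    size_t nlen = strlen(name);

    if (plen > 0 && pattern[0] == '*') {
        /* "*Foo": the name must end with "Foo". */
        size_t suffix_len = plen - 1;
        return nlen >= suffix_len &&
               strcmp(name + (nlen - suffix_len), pattern + 1) == 0;
    }
    if (plen > 0 && pattern[plen - 1] == '*') {
        /* "B*": the name must begin with "B". */
        return strncmp(name, pattern, plen - 1) == 0;
    }
    /* No asterisk: the whole name must match. */
    return strcmp(pattern, name) == 0;
}
```

With this sketch, "B*" accepts "Base" and "*Foo" accepts "BarFoo", while a pattern without an asterisk has to match the whole class name.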

$ dia2code -cl Foo -v foobar.dia

Will create C++ code for all classes but "Foo".

$ dia2code -v test.dia

Will not create any files. Don't know if it may be useful, but it surely is syntactically correct.

HOW IT WORKS

  1. Parse the diagram file with xmlParseFile().
  2. Parse the tree generated in 1 for UML classes to build an umlclasslist (type defined here).
  3. Parse the tree again for UML Generalizations, Realizations, Implements, Dependencies, Associations and packages to add information to the class list.
  4. Generate the structure of the classes (write it into files) from the class list.

Steps 1-3 are done in parse_diagram(). Step 4 is done in generate_code_*(). Both functions are called from main().

NOTES ON UML

What you should put into your diagram

These are mandatory:

  • A non-empty class name.
  • For each attribute, the name and type.
  • For each method, the name.
  • For each method's parameter, the name and type.

These are optional:

  • The stereotype of the class.
  • The default value of attributes.
  • The default value of parameters.
  • The return type of a method. If no type is declared, dia2code will output no type at all for it.

Stereotypes

In IDL, C++, and Ada, some stereotypes are supported that all begin with "CORBA". These are: CORBAConstant, CORBAEnum, CORBATypedef, CORBAStruct, CORBAUnion, CORBAValue. When generating IDL, the corresponding IDL syntax is generated for the UML class. When generating other languages, a rough mapping from the IDL to the other language is generated.

The following conventions are followed by the code generation for CORBA stereotypes:

  • Constant: First attribute's type contains the const type; first attribute's value contains the const's value. Everything else unused.
  • Enum: Attribute names contain the enum literals. Everything else unused.
  • Typedef: First attribute's type contains the original type; first attribute's value optionally contains the dimensions in case of an array. Everything else unused.
  • Struct: Attribute names and types contain the struct members. Everything else unused.
  • Union: First attribute's type contains the switch type. Only enum types are supported as the switch type. The following attributes contain the union cases: the attribute value contains the switch value for the case (exactly one value per case); the attribute name and type contain the union member name and type for that case. Everything else unused.
  • Value: Use much like a normal class. Staticness not supported.

In C++, an abstract class is not explicitly declared as it is in Java. To have a class declared as "abstract" you have to set its "abstract" flag.

For generating Java, dia2code recognizes two stereotypes: "Interface" and "JavaBean". The former is a standard stereotype name, the latter a convention we use here. Both stereotypes are only meaningful when generating Java code. If the stereotype is "JavaBean" then dia2code will output, for each attribute a of the class:

  • A "T getA();" method, with return type conformant to the type of the attribute. If the type of the attribute is "boolean" then the name will be: "boolean isA()".
  • A "void setA(T value);" method, with a parameter, "value", with type conformant to the type of the attribute.
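
The naming rule in the two bullets above can be sketched in C as follows. This is a minimal illustration, assuming the attribute's first letter is upper-cased as the getA/setA pattern suggests; bean_getter_name and bean_setter_name are hypothetical names, not functions from dia2code.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the getter name for attribute `attr` into
   `out`. Attributes of type "boolean" get the "is" prefix, all others
   get "get". Returns 0 on success, -1 if `out` is too small. */
int bean_getter_name(const char *attr, const char *type,
                     char *out, size_t outsize)
{
    assert(attr != NULL && attr[0] != '\0' && type != NULL && out != NULL);

    const char *prefix = strcmp(type, "boolean") == 0 ? "is" : "get";
    int n = snprintf(out, outsize, "%s%c%s", prefix,
                     toupper((unsigned char) attr[0]), attr + 1);
    return (n < 0 || (size_t) n >= outsize) ? -1 : 0;
}

/* Hypothetical helper: writes the setter name ("set" + capitalized
   attribute name) into `out`. Returns 0 on success, -1 if too small. */
int bean_setter_name(const char *attr, char *out, size_t outsize)
{
    assert(attr != NULL && attr[0] != '\0' && out != NULL);

    int n = snprintf(out, outsize, "set%c%s",
                     toupper((unsigned char) attr[0]), attr + 1);
    return (n < 0 || (size_t) n >= outsize) ? -1 : 0;
}
```

For an attribute "name" of type "String" this yields "getName" and "setName"; for a "boolean" attribute "active" the getter becomes "isActive".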

For generating PHP, the "interface" stereotype is recognized and produces an interface rather than a class.

Note: strictly speaking, dia2code does not write these accessors out directly. It adds the methods to the class and attaches the suggested implementation to each one, so they are then treated like any other method. generate_code_java outputs an implementation whenever one is available. generate_code_cpp happens to do the same, but I feel it is most useful when generating Java code.

Visibility

Dia2Code does not handle the "implementation" visibility for methods (yet). So when you have a class that implements a method that was declared abstract in a base class, you have to give it the same visibility as the parent class's method. The visibility of the method is not printed if it is "implementation"; this may be a source of bugs.

Method's return type

If you leave the "type" entry in the method declaration empty, then no type will be declared for it. This is useful with constructors, when the return type is implicit (both in C++ and Java). If the return type is void, you should explicitly declare it in the UML diagram. I don't know if this is a good practice, I just thought it was reasonable. Everyone is welcome to discuss it.

Packages

The UML standard states that there are (mostly) two ways of representing packages: a large box with all the elements inside (Large Package in Dia) or a small box (Small Package in Dia) to which the package's elements connect with a symbol that Dia does not (yet?) include: a line with a crossed-out circle at the container's end. You can, of course, put a normal line with a "Coordinate origin" arrow at one end, and it will be aesthetically correct, but of no use to Dia2Code.

Dia2code, currently, checks only for intersections of packages (Large or Small) with other objects like classes and other packages, so it will be mostly useful with diagrams that use Large Packages instead of Small ones.
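
A geometric membership test of this kind could look like the sketch below. It assumes that an object belongs to a package when its bounding box lies entirely inside the package's box, and that the smallest enclosing package is the direct parent; both the rule and the names (struct box, box_contains, innermost_package) are illustrative assumptions, not dia2code's actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bounding box of a diagram object. */
struct box {
    double x, y;            /* top-left corner */
    double width, height;
};

/* True when `inner` lies entirely within `outer` (edges may touch). */
bool box_contains(const struct box *outer, const struct box *inner)
{
    assert(outer != NULL && inner != NULL);
    return inner->x >= outer->x &&
           inner->y >= outer->y &&
           inner->x + inner->width  <= outer->x + outer->width &&
           inner->y + inner->height <= outer->y + outer->height;
}

/* Returns the index of the smallest package box that contains `object`,
   or -1 when no package contains it. */
int innermost_package(const struct box *packages, size_t count,
                      const struct box *object)
{
    int best = -1;
    double best_area = 0.0;

    for (size_t i = 0; i < count; i++) {
        if (!box_contains(&packages[i], object))
            continue;
        double area = packages[i].width * packages[i].height;
        if (best < 0 || area < best_area) {
            best = (int) i;
            best_area = area;
        }
    }
    return best;
}
```

A test like this only works for Large Packages, because a Small Package box does not enclose its members.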

As of version 0.8.2, the Java, IDL, C++, and Ada code generators use package information. The other generators should be upgraded so that they generate better import/include directives. The Java code generator currently writes all files to the specified directory; it does not create a directory for each package. The IDL, C++, and Ada generators generate files only for the outermost packages/classes. UML elements nested inside packages are generated as items nested within the IDL module, C++ namespace, or Ada package.

INFORMATION FOR DEVELOPERS

Code Generators:

A code generator function is a function that takes a pointer to a batch structure. The structure looks like this:

struct batch {
    umlclasslist classlist;
    char *       outdir;
    int          clobber;
    namelist     classes;
    int          mask;
};

The classes to generate are in classlist.

The output directory is outdir. The files that a code generator opens for writing MUST have this output directory as a prefix, UNLESS it is NULL, in which case it defaults to ".". The function SHOULD check the length of the directory name to avoid buffer overflows.
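
One way to honour both requirements (the outdir prefix and the length check) is to build the path with snprintf and reject results that do not fit. The helper below is a sketch under that assumption; build_output_path is a hypothetical name and is not part of dia2code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds "<outdir>/<filename>" in `out`.
   A NULL outdir is treated as ".". Returns 0 on success, or -1 when
   the resulting path would not fit into `outsize` bytes. */
int build_output_path(const char *outdir, const char *filename,
                      char *out, size_t outsize)
{
    assert(filename != NULL && out != NULL);

    if (outdir == NULL)
        outdir = ".";

    /* snprintf never writes more than outsize bytes and returns the
       length the full string would have had, so truncation is easy
       to detect. */
    int n = snprintf(out, outsize, "%s/%s", outdir, filename);
    if (n < 0 || (size_t) n >= outsize)
        return -1;
    return 0;
}
```

A generator would call this before opening the file and skip or abort when it returns -1, instead of writing past the end of a fixed-size buffer.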

If the clobber flag is not set (as requested with -nc), any already existing file in the output directory MUST NOT be overwritten.

Classes is a list of the classes to create code for. Mask is a flag that inverts this selection.
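
As a rough illustration of how the class list and the mask combine, the sketch below applies the same exclusive-or test that appears in the example code of this section to a simplified name list. The name_node type and both function names are hypothetical stand-ins for the real namelist and is_present(), and the mask values noted in the comments are inferred from the documented behaviour of -cl and -v rather than taken from the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Illustrative stand-in for dia2code's namelist type. */
struct name_node {
    const char *name;
    struct name_node *next;
};

/* Returns 1 when `name` occurs in the list, 0 otherwise. */
int name_is_listed(const struct name_node *list, const char *name)
{
    assert(name != NULL);
    for (; list != NULL; list = list->next) {
        if (strcmp(list->name, name) == 0)
            return 1;
    }
    return 0;
}

/* Code is generated when "listed" and "mask" agree (mask must be 0 or 1).
   Inferred mask values:
     no -cl, no -v : empty list, mask 0 -> every class is generated
     -cl only      : mask 1             -> only listed classes
     -cl and -v    : mask 0             -> only unlisted classes
     -v only       : empty list, mask 1 -> nothing is generated */
bool should_generate(const struct name_node *classes, int mask,
                     const char *name)
{
    return !(name_is_listed(classes, name) ^ mask);
}
```

Under these assumptions the four documented combinations of -cl and -v fall out of a single expression, which is why the generator example needs no special cases.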

Some example code:

#include "dia2code.h"
#include "decls.h"
#include "includes.h"

void generate_code_foo (batch *b)
{
    umlclasslist tmplist;
    declaration *d;

    tmplist = b->classlist;
    while (tmplist != NULL) {
        if (! (is_present (b->classes, tmplist->key->name) ^ b->mask)) {
            push (tmplist, b);
            /*
                The class is pushed onto the global `decls'.
                We do this in order to support packages:
                A single package may contain multiple classes
                and these classes must be generated in the correct
                order, i.e. declare classes that depend on other
                classes towards the end.
            */
        }
        tmplist=tmplist->next;
    }
    /* Generate a file for each outer declaration.  */
    d = decls;
    while (d != NULL) {
        char *name;
        char outfilename[256];

        if (d->decl_kind == dk_module) {
            name = d->u.this_module->pkg->name;
        } else {         /* dk_class */
            name = d->u.this_class->key->name;
        }

        /* Bounded write: long names are truncated instead of overflowing.  */
        snprintf (outfilename, sizeof (outfilename), "%s.h", name);

        spec = open_outfile (outfilename, b);
        if (spec == NULL) {
            d = d->next;
            continue;
        }

        /* Determine include files needed.  */
        includes = NULL;
        determine_includes (d, b);
        if (includes) {
            namelist incfile = includes;
            while (incfile != NULL) {
                if (!eq (incfile->name, name)) {
                    emit ("#include \"%s.h\"\n", incfile->name);
                }
                incfile = incfile->next;
            }
            emit ("\n");
        }

        /*
            We generate all the code here
        */

        fclose (spec);
        d = d->next;
    }
}

Generating import/include clauses:

includes.h contains the function determine_includes() and the global variable `includes'. Handling is as shown in the example above.

determine_includes() assumes that a file is generated for each outer package and for each class not nested inside a package.

If this does not make sense for the programming language you are targeting, then there are two lower-level functions that can help the code generator know which classes are used inside the current class: find_classes() and list_classes(). The former is the older of the two. It returns a namelist (a list of char*) with the names of the classes used that are also declared in the same diagram (that is, that are on the classlist). The latter was added in version 0.7. It performs basically the same search, but instead of a list of char* it returns a classlist with pointers to the diagram classes themselves. Which to use is up to you. For generators of C and SQL code, find_classes() will suffice. For generators of Java code, which make heavy use of the package information, list_classes() is more suitable.

BUGS

Note: some bugs may not be listed here.

  • No checking for full filesystem while writing files.
  • Does not follow the DTD of a Dia Diagram, rather parses as it sees fit.
  • Allocates memory proportionally to the size of the diagrams. It should work OK with small diagrams but may slow down with BIG ones.

AUTHORS

Original author

Javier O'Hara joh314@users.sourceforge.net

MAINTAINER

Richard Torkar richard.torkar@htu.se

Contributors

(in alphabetical order, by last name)

THANKS

Thanks to Collin Starkweather and Slush Gore for the extra help. Also, thanks to all the people that have contacted me with suggestions and bug reports.