To-do list for xgawk enhancements: Done: - Add AWKLIBPATH with default pointing to ${libexecdir}/$PACKAGE/$VERSION - Change default shared library extension from ".so" to ".$shlibext" - Patch build infrastructure so that the current code in the extension subdirectory gets built and installed into the default $AWKLIBPATH location. - Implement @load - Patch ERRNO handling to create a simple API for use by extensions: extern void update_ERRNO_int(int) enum errno_translate { TRANSLATE, DONT_TRANSLATE }; extern void update_ERRNO_string(const char *string, enum errno_translate); extern void unset_ERRNO(void); - Add valgrind-noleak target. - Fix minor bug in fork extension, and add wait function. - Patch filefuncs extension to read symbolic links more robustly. - Add shared library tests. To do (not necessarily in this order): - Enhance extension/fork.c waitpid to allow the caller to specify the options. And add an optional array argument to wait and waitpid in which to return exit status information. - Maybe add more shared library tests. - Figure out how to support xgawk on platforms such as Cygwin where a DLL cannot be linked with unresolved references. There are currently 3 possible solutions: 1. Restructure gawk as a stub calling into a shared library. 2. Move a subset of gawk interfaces into a shared library that can be called by extensions. 3. Change the interface between gawk and extensions so that gawk will pass a pointer to a structure into dlload that contains the addresses of all variables and functions to which the extension may need access. - Enable default ".awk" search in io.c:find_source(). The simple change is to add this code inline in io.c: #ifndef DEFAULT_FILETYPE #define DEFAULT_FILETYPE ".awk" #endif - Fix lint complaints about shared library functions being called without having been defined. For example, try: gawk --lint -lordchr 'BEGIN {print chr(65)}' gawk: warning: function `chr' called but never defined A In ext.c, make_builtin needs to call awkgram.y:func_use. If done naively, I think this would result in complaints about shared library functions defined but not used. So there should probably be an enhancement to func_use and ftable to indicate if it's a shared library function. - Develop a libgawk shared library for use by extensions. In particular, a few existing extensions use a hash API for mapping string handles to structures. In xgawk, we had this API inside array.c, but it probably belongs in a separate libgawk shared library: typedef struct _strhash strhash; extern strhash *strhash_create P((size_t min_table_size)); /* Find an entry in the hash table. If it is not found, the insert_if_missing argument indicates whether a new entry should be created. The caller may set the "data" field to any desired value. If it is a new entry, "data" will be initialized to NULL. */ extern strhash_entry *strhash_get P((strhash *, const char *s, size_t len, int insert_if_missing)); typedef void (*strhash_delete_func)(void *data, void *opaque, strhash *, strhash_entry *); extern int strhash_delete P((strhash *, const char *s, size_t len, strhash_delete_func, void *opaque)); extern void strhash_destroy P((strhash *, strhash_delete_func, void *opaque)); - Running "make install" should install the new libgawk shared library as well as header files needed to build extensions under /usr/include/gawk. The extensions include "awk.h", and that pulls in the following headers (according to gcc -M) : awk.h config.h custom.h gettext.h mbsupport.h protos.h getopt.h \ regex.h dfa.h Most likely, most of this is not required. Arnold has suggested creating a smaller header to define the public interface for use by shared libraries. One could imagine having "awk-ext.h" that is included by "awk.h". Separate projects for major standalone extensions. Where should these be hosted? - Time. This defines sleep and gettimeofday. This one is quite trivial, and I propose that it be included in the mainline gawk distro. - XML - PostgreSQL - GD - MPFR. Is this still useful if MPFR support will be integrated into gawk? Possible changes requiring (further) discussion: - Change from dlopen to using the libltdl library (i.e. lt_dlopen). This may support more platforms. - Implement namespaces. Arnold suggested the following in an email: - Extend the definition of an 'identifier' to include "." as a valid character although an identifier can't start with it. - Extension libraries install functions and global variables with names that have a "." in them: XML.parse(), XML.name, whatever. - Awk code can read/write such variables and call such functions, but they cannot define such functions function XML.foo() { .. } # error or create a variable with such a name if it doesn't exist. This would be a run-time error, not a parse-time error. - This last rule may be too restrictive. I don't want to get into fancy rules a la perl and file-scope visibility etc, I'd like to keep things simple. But how we design this is going to be very important. - Include a sample rpm spec file in a new packaging subdirectory. - Patch lexer for @include and @load to make quotes optional. - Add a -i (--include) option.