/* Parse a transform specification: a comma-separated list of items of the
 * form key(value) or key(). Keys starting with '#' select graph-level
 * transforms (stored in par_graph); all other keys select edge-level unary
 * transforms (stored in par_edge). Returns NULL on failure.
*/

mclgTF* mclgTFparse
(  mcxLink*    encoding_link
,  mcxTing*    thestring
)
   {  mclgTF* gtf = mcxAlloc(sizeof gtf[0], EXIT_ON_FAIL)
   ;  const char* me = "mclgTFparse"
   ;  const char* a = thestring->str
   ;  const char* z = thestring->str + thestring->len
   ;  mcxTing* func = mcxTingEmpty(NULL, thestring->len)
   ;  mcxTing* arg = mcxTingEmpty(NULL, thestring->len)
   ;  int n = 0

   ;  if (!(gtf->par_edge = mclpARensure(NULL, 10)))
      return NULL                /* +memleak gtf */
   ;  if (!(gtf->par_graph = mclpARensure(NULL, 10)))
      return NULL                /* +memleak gtf, gtf->par_edge */

                                 /* empty or all-whitespace spec */
   ;  if (!mcxStrChrAint(thestring->str, isspace, thestring->len))
      {  mcxTingFree(&func)
      ;  mcxTingFree(&arg)
      ;  return gtf
   ;  }

      while (a < z)
      {  const char* val, *key
      ;  char* onw = NULL
      ;  int tfe = -1, tfg = -1
      ;  mcxbool nought = FALSE
      ;  unsigned char k0
      ;  double d
      ;  int t

      ;  mcxTingEmpty(arg, z-a)
      ;  mcxTingEmpty(func, z-a)
      ;  n = 0

                                 /* try key() first, then key(value) */
      ;  if ((t = sscanf(a, " %[a-z_#-] ( )%n", func->str, &n)) >= 1 && n > 0)
         NOTHING
      ;  else if ((t = sscanf(a, " %[a-z_#-] ( %[^)_ ] )%n", func->str, arg->str, &n)) >= 2 && n > 0)
         NOTHING
      ;  else
         break

      ;  a += n
      ;  key= func->str
      ;  val= arg->str
      ;  k0 = key[0]

      ;  d = strtod(val, &onw)

      ;  if (!val || !strlen(val))
         nought = TRUE
      ;  else if (val == onw)
         {  mcxErr(me, "failed to parse number <%s>", val)
         ;  break
      ;  }

         if (k0 == '#')
         {  if (!strcmp(key, "#ceilnb")) tfg = MCLG_TF_CEILNB
         ;  else if (!strcmp(key, "#knn")) tfg = MCLG_TF_KNN
         ;  else if (!strcmp(key, "#n")) tfg = MCLG_TF_TOPN
         ;  else if (!strcmp(key, "#ils")) tfg = MCLG_TF_ILS
         ;  else if (!strcmp(key, "#mcl")) tfg = MCLG_TF_MCL
         ;  else if (!strcmp(key, "#arcmcl")) tfg = MCLG_TF_ARC_MCL
         ;  else if (!strcmp(key, "#arcsub")) tfg = MCLG_TF_ARCSUB
         ;  else if (!strcmp(key, "#arcmax")) tfg = MCLG_TF_ARCMAX
         ;  else if (!strcmp(key, "#arcmingq")) tfg = MCLG_TF_ARCMINGQ
         ;  else if (!strcmp(key, "#arcmingt")) tfg = MCLG_TF_ARCMINGT
         ;  else if (!strcmp(key, "#arcminlq")) tfg = MCLG_TF_ARCMINLQ
         ;  else if (!strcmp(key, "#arcminlt")) tfg = MCLG_TF_ARCMINLT
         ;  else if (!strcmp(key, "#arcdiffgq")) tfg = MCLG_TF_ARCDIFFGQ
         ;  else if (!strcmp(key, "#arcdiffgt")) tfg = MCLG_TF_ARCDIFFGT
         ;  else if (!strcmp(key, "#arcdifflq")) tfg = MCLG_TF_ARCDIFFLQ
         ;  else if (!strcmp(key, "#arcdifflt")) tfg = MCLG_TF_ARCDIFFLT
         ;  else if (!strcmp(key, "#arcmaxgq")) tfg = MCLG_TF_ARCMAXGQ
         ;  else if (!strcmp(key, "#arcmaxgt")) tfg = MCLG_TF_ARCMAXGT
         ;  else if (!strcmp(key, "#arcmaxlq")) tfg = MCLG_TF_ARCMAXLQ
         ;  else if (!strcmp(key, "#arcmaxlt")) tfg = MCLG_TF_ARCMAXLT
         ;  else if (!strcmp(key, "#selfrm")) tfg = MCLG_TF_SELFRM
         ;  else if (!strcmp(key, "#selfmax")) tfg = MCLG_TF_SELFMAX
         ;  else if (!strcmp(key, "#normself")) tfg = MCLG_TF_NORMSELF
         ;  else if (!strcmp(key, "#add")) tfg = MCLG_TF_ADD
         ;  else if (!strcmp(key, "#max")) tfg = MCLG_TF_MAX
         ;  else if (!strcmp(key, "#min")) tfg = MCLG_TF_MIN
         ;  else if (!strcmp(key, "#mul")) tfg = MCLG_TF_MUL
         ;  else if (!strcmp(key, "#tug")) tfg = MCLG_TF_TUG
         ;  else if (!strcmp(key, "#ssq")) tfg = MCLG_TF_SSQ
         ;  else if (!strcmp(key, "#qt")) tfg = MCLG_TF_QT
         ;  else if (!strcmp(key, "#tp") || !strcmp(key, "#rev")) tfg = MCLG_TF_TRANSPOSE
         ;  else if (!strcmp(key, "#step")) tfg = MCLG_TF_STEP
         ;  else if (!strcmp(key, "#thread")) tfg = MCLG_TF_THREAD
         ;  else if (!strcmp(key, "#shrug")) tfg = MCLG_TF_SHRUG
         ;  else if (!strcmp(key, "#shuffle")) tfg = MCLG_TF_SHUFFLE
      ;  }
         else
         {  if (!strcmp(key, "gq")) tfe = MCLX_UNARY_GQ
         ;  else if (!strcmp(key, "gt")) tfe = MCLX_UNARY_GT
         ;  else if (!strcmp(key, "lt")) tfe = MCLX_UNARY_LT
         ;  else if (!strcmp(key, "lq")) tfe = MCLX_UNARY_LQ
         ;  else if (!strcmp(key, "rand")) tfe = MCLX_UNARY_RAND
         ;  else if (!strcmp(key, "mul")) tfe = MCLX_UNARY_MUL
         ;  else if (!strcmp(key, "scale")) tfe = MCLX_UNARY_SCALE
         ;  else if (!strcmp(key, "add")) tfe = MCLX_UNARY_ADD
         ;  else if (!strcmp(key, "abs")) tfe = MCLX_UNARY_ABS
         ;  else if (!strcmp(key, "ceil")) tfe = MCLX_UNARY_CEIL
         ;  else if (!strcmp(key, "floor")) tfe = MCLX_UNARY_FLOOR
         ;  else if (!strcmp(key, "pow")) tfe = MCLX_UNARY_POW
         ;  else if (!strcmp(key, "exp")) tfe = MCLX_UNARY_EXP
         ;  else if (!strcmp(key, "log")) tfe = MCLX_UNARY_LOG
         ;  else if (!strcmp(key, "neglog")) tfe = MCLX_UNARY_NEGLOG
      ;  }

         if (tfe < 0 && tfg < 0)
         {  mcxErr(me, "unknown value transform <%s>", key)
         ;  break
      ;  }

         if (tfe >= 0)
         {  if (nought)
            {  if
               (  tfe == MCLX_UNARY_LOG
               || tfe == MCLX_UNARY_ABS
               || tfe == MCLX_UNARY_EXP
               || tfe == MCLX_UNARY_NEGLOG
               )
               d = 0.0
            ;  else
               {  mcxErr(me, "transform <%s> needs value", key)
               ;  break
            ;  }
         ;  }
            mclpARextend(gtf->par_edge, tfe, d)
      ;  }
         else if (tfg >= 0)
         {  if (nought)
            {  if
               (  tfg >= MCLG_TF_DUMMY_NOVALUE_START
               && tfg <= MCLG_TF_DUMMY_NOVALUE_END
               )
               d = 0.0
            ;  else if (tfg == MCLG_TF_TUG || tfg == MCLG_TF_SHRUG)
               d = 1000.0
            ;  else if (tfg == MCLG_TF_STEP)
               d = 2.0
            ;  else
               {  mcxErr(me, "transform <%s> needs value", key)
               ;  break
            ;  }
         ;  }
            mclpARextend(gtf->par_edge, MCLX_UNARY_UNUSED, 0.0)
         ;  mclpARextend(gtf->par_graph, tfg, d)
      ;  }

                                 /* items are separated by commas */
         a = mcxStrChrAint(a, isspace, z-a)
      ;  if (!a || a[0] != ',')
         break
      ;  a++
   ;  }

                                 /* a is NULL only if everything was consumed */
      if (a)
      {  mcxErr(me, "trailing part <%s> not matched", a)
      ;  mclpARfree(&(gtf->par_edge))
      ;  mclpARfree(&(gtf->par_graph))
      ;  mcxFree(gtf)
      ;  gtf = NULL
   ;  }

      mcxTingFree(&func)
   ;  mcxTingFree(&arg)
   ;  return gtf
   ;  }

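
/* Illustrative sketch, not part of the library: the item scanner used by
 * mclgTFparse, isolated so the two sscanf formats can be tried on their own.
 * The name sketch_scan_transform, the fixed 64-byte buffers and the %63
 * field widths are assumptions made for this example; the real code sizes
 * its buffers from the input string instead. Kept under #if 0 so it never
 * becomes part of this translation unit.
*/
#if 0
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Scan one item of the form key() or key(value) from the start of s.
 * key and val must each point to at least 64 bytes.
 * Returns the number of characters consumed, or 0 if nothing matched.
 * On failure the contents of key and val are unspecified.
 */
static int sketch_scan_transform(const char *s, char *key, char *val)
{
    int n = 0;

    assert(s && key && val);
    key[0] = val[0] = '\0';

    /* First try the argument-less form, e.g. "abs()".
     * %n is only stored if the closing parenthesis was matched. */
    if (sscanf(s, " %63[a-z_#-] ( )%n", key, &n) >= 1 && n > 0)
        return n;

    /* Then the form with one argument, e.g. "gq(0.5)". */
    n = 0;
    if (sscanf(s, " %63[a-z_#-] ( %63[^)_ ] )%n", key, val, &n) >= 2 && n > 0)
        return n;

    return 0;
}
```
#endif
/* A caller would advance its cursor by the returned count, skip white space,
 * and expect either a comma or the end of the string, as the loop in
 * mclgTFparse does.
*/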
static mcxstatus read_abc
(  mcxIO* xf
,  mcxTing* buf
,  stream_state *iface
,  double* value
)
   {  mcxstatus status = mcxIOreadLine(xf, buf, MCX_READLINE_CHOMP)
   ;  mcxTing* xkey = mcxTingEmpty(NULL, buf->len)
   ;  mcxTing* ykey = mcxTingEmpty(NULL, buf->len)
   ;  mcxbits bits = iface->bits
   ;  mcxbool strict = bits & MCLXIO_STREAM_STRICT
   ;  mcxbool warn = bits & MCLXIO_STREAM_WARN
   ;  mcxbool label_cbits = bits & (MCLXIO_STREAM_CTAB_STRICT | MCLXIO_STREAM_CTAB_RESTRICT)
   ;  mcxbool label_rbits = bits & (MCLXIO_STREAM_RTAB_STRICT | MCLXIO_STREAM_RTAB_RESTRICT)
   ;  mcxbool label_dbits = bits & (MCLXIO_STREAM_WARN | MCLXIO_STREAM_DEBUG)

   ;  const char* printable
   ;  int cv = 0

   ;  iface->statusx = STATUS_OK
   ;  iface->statusy = STATUS_OK

   ;  do
      {  int xlen = 0
      ;  int ylen = 0

      ;  if (status)
         break

      ;  printable = mcxStrChrAint(buf->str, isspace, buf->len)
      ;  if (printable && (uchar) printable[0] == '#')
         {  status = mcxIOreadLine(xf, buf, MCX_READLINE_CHOMP)
         ;  continue
      ;  }

         mcxTingEnsure(xkey, buf->len)    /* fixme, bit wasteful */
      ;  mcxTingEnsure(ykey, buf->len)    /* fixme, bit wasteful */

      ;  cv =     strchr(buf->str, '\t')
               ?  sscanf(buf->str, "%[^\t]\t%[^\t]%lf", xkey->str, ykey->str, value)
               :  sscanf(buf->str, "%s%s%lf", xkey->str, ykey->str, value)

         /* WARNING: [xy]key->len have to be set.
          * we first check sscanf return value
         */

      ;  if (cv == 2)
         *value = 1.0

      ;  else if (cv != 3)
         {  if (warn || strict)
            mcxErr
            (  module
            ,  "abc-parser chokes at line %ld [%s]"
            ,  (long) xf->lc
            ,  buf->str
            )
         ;  if (strict)
            {  status = STATUS_FAIL
            ;  break
         ;  }
            status = mcxIOreadLine(xf, buf, MCX_READLINE_CHOMP)
         ;  continue
      ;  }
         else if (!(*value <= FLT_MAX))   /* should catch nan, inf */
         *value = 1.0

      ;  xlen = strlen(xkey->str)
      ;  ylen = strlen(ykey->str)

      ;  xkey->len = xlen
      ;  ykey->len = ylen

      ;  status = iface->statusx = handle_label(&xkey, &(iface->x), iface->map_c, label_cbits | label_dbits, "col")
      ;  if (status == STATUS_FAIL || status == STATUS_IGNORE)
         break

      ;  status = iface->statusy = handle_label(&ykey, &(iface->y), iface->map_r, label_rbits | label_dbits, "row")
      ;  if (status == STATUS_FAIL || status == STATUS_IGNORE)
         break

      ;  status = STATUS_OK               /* Note: status can never be STATUS_NEW */
      ;  break
   ;  }
      while (1)

;if(DEBUG2) fprintf(stderr, "read_abc status %s\n", MCXSTATUS(status))

   ;  if (status == STATUS_NEW)
      mcxErr(module, "read_abc panic, because status == STATUS_NEW")

      /* Below we remove the key from the map if it should be
       * ignored. It will be freed in the block following this one.
      */
   ;  if
      (  iface->statusx == STATUS_NEW
      && (iface->statusy == STATUS_FAIL || iface->statusy == STATUS_IGNORE)
      )
      {  mcxHashSearch(xkey, iface->map_c->map, MCX_DATUM_DELETE)
      ;  iface->map_c->max_seen--
      ;  iface->statusx = STATUS_IGNORE
   ;  }
      else if
      /* Impossible (given that we break when iface->statusx) but defensive */
      (  iface->statusy == STATUS_NEW
      && (iface->statusx == STATUS_FAIL || iface->statusx == STATUS_IGNORE)
      )
      {  mcxHashSearch(ykey, iface->map_r->map, MCX_DATUM_DELETE)
      ;  iface->map_r->max_seen--
      ;  iface->statusy = STATUS_IGNORE
   ;  }

      /* NOTE handle_label might have set either to NULL but
       * that's OK. This is needed because handle_label(&xkey)
       * might succeed and free xkey (because already present in
       * map_c->map); then when handle_label(&ykey) fails we need to
       * clean up.
      */
   ;  if (status)
      {  mcxTingFree(&xkey)   /* kv deleted if iface->statusx == STATUS_NEW */
      ;  mcxTingFree(&ykey)   /* kv deleted if iface->statusy == STATUS_NEW */
   ;  }

      return status
   ;  }

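
/* Illustrative sketch, not part of the library: how an abc line is split
 * into two labels and an optional weight. As in read_abc, a line that
 * contains a tab is split on tabs (so labels may contain spaces), any other
 * line is split on white space, and a missing weight defaults to 1.0.
 * The name sketch_parse_abc, the 64-byte label buffers and the %63 field
 * widths are assumptions made for this example. Kept under #if 0 so it
 * never becomes part of this translation unit.
*/
#if 0
```c
#include <assert.h>
#include <float.h>
#include <stdio.h>
#include <string.h>

/* Split line into xlabel, ylabel and *value.
 * xlabel and ylabel must each point to at least 64 bytes.
 * Returns 0 on success, -1 if fewer than two labels were found.
 */
static int sketch_parse_abc(const char *line, char *xlabel, char *ylabel, double *value)
{
    double v = 1.0;
    int cv;

    assert(line && xlabel && ylabel && value);
    xlabel[0] = ylabel[0] = '\0';

    cv = strchr(line, '\t')
       ? sscanf(line, "%63[^\t]\t%63[^\t]%lf", xlabel, ylabel, &v)
       : sscanf(line, "%63s%63s%lf", xlabel, ylabel, &v);

    if (cv == 2)
        v = 1.0;                  /* weight omitted */
    else if (cv != 3)
        return -1;                /* not even two labels */
    else if (!(v <= FLT_MAX))
        v = 1.0;                  /* nan or +inf */

    *value = v;
    return 0;
}
```
#endif
/* In read_abc the two labels are subsequently passed to handle_label, which
 * maps them to column and row indices; that step is left out here.
*/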
static mcxstatus read_123
(  mcxIO* xf
,  mcxTing* buf
,  stream_state* iface
,  mclxIOstreamer* streamer
,  double* value
,  mcxbits bits
)
   {  mcxstatus status = mcxIOreadLine(xf, buf, MCX_READLINE_CHOMP)
   ;  int cv = 0
   ;  const char* printable
   ;  const char* me = module
   ;  mcxbool strict = bits & MCLXIO_STREAM_STRICT
   ;  mcxbool warn = bits & MCLXIO_STREAM_WARN
   ;  unsigned long x = 0, y = 0

   ;  while (1)
      {  if (status)
         break

      ;  status = STATUS_FAIL
      ;  printable = mcxStrChrAint(buf->str, isspace, buf->len)

                                 /* skip comment lines */
      ;  if (printable && (unsigned char) printable[0] == '#')
         {  status = mcxIOreadLine(xf, buf, MCX_READLINE_CHOMP)
         ;  continue
      ;  }

         cv = sscanf(buf->str, "%lu%lu%lf", &x, &y, value)

                                 /* %lu wraps negative input to a huge value */
      ;  if (x > LONG_MAX || y > LONG_MAX)
         {  mcxErr
            (me, "negative values in input stream? unsigned %lu %lu", x, y)
         ;  break
      ;  }

         if (cv == 2)
         *value = 1.0
      ;  else if (cv != 3)
         {  if (strict || warn)
            mcxErr
            (  module
            ,  "123-parser chokes at line %ld [%s]"
            ,  (long) xf->lc
            ,  buf->str
            )
         ;  if (strict)
            break
         ;  status = mcxIOreadLine(xf, buf, MCX_READLINE_CHOMP)
         ;  continue
      ;  }
         else if (!(*value <= FLT_MAX))   /* should catch nan, inf */
         *value = 1.0

      ;  if
         (  (streamer->cmax_123 && x >= streamer->cmax_123)
         || (streamer->rmax_123 && y >= streamer->rmax_123)
         )
         {  status = STATUS_IGNORE
         ;  break
      ;  }

         status = STATUS_OK
      ;  break
   ;  }

      if (!status)
      {  iface->x = x
      ;  iface->y = y
      ;  if (iface->map_c->max_seen+1 < x+1)    /* note mixed-sign comparison */
         iface->map_c->max_seen = x
      ;  if (iface->map_r->max_seen+1 < y+1)    /* note mixed-sign comparison */
         iface->map_r->max_seen = y
   ;  }

      return status
   ;  }

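
/* Illustrative sketch, not part of the library: the per-line checks that
 * read_123 applies to an "x y [value]" record, without the I/O, the
 * strict/warn handling and the cmax/rmax limits. The name sketch_parse_123
 * and its return convention are assumptions made for this example. Kept
 * under #if 0 so it never becomes part of this translation unit.
*/
#if 0
```c
#include <assert.h>
#include <float.h>
#include <limits.h>
#include <stdio.h>

/* Parse two unsigned indices and an optional weight from line.
 * Returns 0 on success, -1 if the line is rejected.
 */
static int sketch_parse_123(const char *line, unsigned long *x, unsigned long *y, double *value)
{
    unsigned long lx = 0, ly = 0;
    double v = 1.0;
    int cv;

    assert(line && x && y && value);
    cv = sscanf(line, "%lu%lu%lf", &lx, &ly, &v);

    /* %lu accepts a leading minus sign and wraps the result, so a
     * negative index shows up as a value above LONG_MAX. */
    if (lx > LONG_MAX || ly > LONG_MAX)
        return -1;

    if (cv == 2)
        v = 1.0;                  /* weight omitted */
    else if (cv != 3)
        return -1;                /* fewer than two indices */
    else if (!(v <= FLT_MAX))
        v = 1.0;                  /* nan or +inf */

    *x = lx;
    *y = ly;
    *value = v;
    return 0;
}
```
#endif
/* The comparison is written as !(v <= FLT_MAX) rather than v > FLT_MAX
 * because every ordered comparison with nan is false; the negated form
 * therefore covers nan as well as +inf.
*/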
/* Purpose: read a single x/y combination. The x may be cached
 * due to the etc format, where a single line always refers to the same x
 * and that x is listed only at the start of the line, or omitted with
 * the etc-ai and 235-ai formats.
 *
 * state->x_prev may be used by read_etc in order to obtain the
 * current x index.
*/

static mcxstatus read_etc
(  mcxIO* xf
,  stream_state *iface
,  etc_state *state
,  double* value
)
   {  mcxbits bits = iface->bits
   ;  FILE* stdbug = stdout
   ;  mcxstatus status = STATUS_OK
   ;  mcxTing* ykey = NULL
   ;  mcxTing* xkey = NULL
   ;  const char* printable
   ;  mcxbool label_cbits = bits & (MCLXIO_STREAM_CTAB_STRICT | MCLXIO_STREAM_CTAB_RESTRICT)
   ;  mcxbool label_rbits = bits & (MCLXIO_STREAM_RTAB_STRICT | MCLXIO_STREAM_RTAB_RESTRICT)
   ;  mcxbool label_dbits = bits & (MCLXIO_STREAM_WARN | MCLXIO_STREAM_DEBUG)

   ;  iface->statusx = STATUS_OK
   ;  iface->statusy = STATUS_OK

   ;  iface->x = state->x_prev
   ;  *value = 1.0

;if(DEBUG)fprintf(stdbug, "read_etc initially set x to %d\n", (int) iface->x)

   ;  if (!state->etcbuf)
      state->etcbuf = mcxTingEmpty(NULL, 100)

   ;  do
      {  int n_char_read = 0
      ;  if (state->etcbuf->len != state->etcbuf_check)
         {  mcxErr
            (  module
            ,  "read_etc sanity check failed %ld %ld"
            ,  (long) state->etcbuf->len
            ,  (long) state->etcbuf_check
            )
         ;  status = STATUS_FAIL
         ;  break
      ;  }

         /* do we need to read a line ? */
         /* -> then set iface->x */
         /* fixmefixme: funcify this */
         /* iface->x can only be changed in this branch */
/* ************************************************************************** */
         if (state->etcbuf_ofs >= state->etcbuf->len)
         {  state->etcbuf_ofs = 0
         ;  state->n_y = 0
         ;  if ((status = mcxIOreadLine(xf, state->etcbuf, MCX_READLINE_CHOMP)))
            break
         ;  state->etcbuf_check = state->etcbuf->len

         ;  if
            (  !(printable = mcxStrChrAint(state->etcbuf->str, isspace, -1))
            || (unsigned char) *printable == '#'
            )
            {  state->etcbuf_ofs = state->etcbuf->len
            ;  iface->statusy = STATUS_IGNORE
            ;  break
               /* fixme: ^ statusx seems to work as well. cleanify design */
         ;  }

         ;  if (bits & (MCLXIO_STREAM_ETC_AI | MCLXIO_STREAM_235_AI))
            {
            }
            /* In this branch we do not issue handle_label, so we take care of max_seen. */
            else if (bits & MCLXIO_STREAM_235)
            {  if (1 != sscanf(state->etcbuf->str, "%lu%n", &(iface->x), &n_char_read))
               {  iface->statusx = STATUS_FAIL
               ;  break
            ;  }
               state->etcbuf_ofs += n_char_read
            ;  if (iface->map_c->max_seen+1 < iface->x+1)    /* note mixed-sign comparison */
               iface->map_c->max_seen = iface->x
            ;  state->x_prev = iface->x
         ;  }
            else if (bits & MCLXIO_STREAM_ETC)
            {  xkey = mcxTingEmpty(NULL, state->etcbuf->len)
            ;  if (1 != sscanf(state->etcbuf->str, "%s%n", xkey->str, &n_char_read))
               break
            ;  state->etcbuf_ofs += n_char_read
            ;  xkey->len = strlen(xkey->str)
            ;  xkey->str[xkey->len] = '\0'
;if(DEBUG3)fprintf(stderr, "max %lu\n", (ulong) iface->map_c->max_seen)
            ;  iface->statusx = handle_label(&xkey, &(iface->x), iface->map_c, label_cbits | label_dbits, "col")
;if(DEBUG3)fprintf(stderr, "max %lu x %lu\n", (ulong) iface->map_c->max_seen, (ulong) iface->x)
            ;  if (iface->statusx == STATUS_IGNORE || iface->statusx == STATUS_FAIL)
               {                 /* iface->x = 141414 recentlyadded */
;if(DEBUG3)fprintf(stderr, "max %lu\n", (ulong) iface->map_c->max_seen)
               ;  break
            ;  }
               /* ^ Consider what happens when we break here (x label not
                * accepted) with map_c->max_seen. Basically x label is
                * independent of y, so we never need to undo the
                * handle_label action.
               */
               state->x_prev = iface->x
         ;  }
            else
            mcxDie(1, module, "strange, really")
      ;  }
/* ************************************************************************** */

         if
         (  !( printable = mcxStrChrAint(state->etcbuf->str+state->etcbuf_ofs, isspace, -1) )
         || (uchar) *printable == '#'
         )
         {  state->etcbuf_ofs = state->etcbuf->len
         ;  /* iface->y = 141414 recentlyadded */
         ;  iface->statusy = STATUS_IGNORE
         ;  break
      ;  }

         if (bits & (MCLXIO_STREAM_235_AI | MCLXIO_STREAM_235))
         {  if (1 != sscanf(state->etcbuf->str+state->etcbuf_ofs, "%lu%n", &(iface->y), &n_char_read))
            {  char* s = state->etcbuf->str+state->etcbuf_ofs
            ;  while(isspace((uchar) s[0]))
               s++
            ;  mcxErr
               (  module
               ,  "unexpected string starting with <%c> on line %lu"
               ,  (int) ((uchar) s[0])
               ,  (ulong) xf->lc
               )
            ;  iface->statusy = STATUS_FAIL
         ;  }
            else
            {
;if(DEBUG3)fprintf(stdbug, "hit at %d\n", (int) state->etcbuf_ofs)
            ;  state->etcbuf_ofs += n_char_read
            ;  if (iface->map_r->max_seen+1 < iface->y+1)    /* note mixed-sign comparison */
               iface->map_r->max_seen = iface->y
         ;  }
         }
         else                    /* ETCANY */
         {  ykey = mcxTingEmpty(NULL, state->etcbuf->len)
         ;  if (1 != sscanf(state->etcbuf->str+state->etcbuf_ofs, "%s%n", ykey->str, &n_char_read))
            break
         ;  ykey->len = strlen(ykey->str)
         ;  ykey->str[ykey->len] = '\0'
         ;  state->etcbuf_ofs += n_char_read
         ;  iface->statusy = handle_label(&ykey, &(iface->y), iface->map_r, label_rbits | label_dbits, "row")
      ;  }

         /* this won't scale well in terms of organisation if and when
          * tabs are allowed with 235 mode, because in that case,
          * with 235-ai and restrict-tabr and extend-tabc we will
          * need the stuff below duplicated in the 235 branch above.
          * what happens here is that we only decide now whether
          * the auto-increment is actually happening. It depends
          * on there being at least one y that was not rejected.
         */
      ;  if
         (  (bits & (MCLXIO_STREAM_ETC_AI | MCLXIO_STREAM_235_AI))
         && (iface->statusy == STATUS_OK || iface->statusy == STATUS_NEW)
         && !state->n_y
         )
         {  iface->x = state->x_prev + 1     /* works first time around */
         ;  iface->map_c->max_seen = state->x_prev + 1
         ;  state->n_y++
         ;  state->x_prev = iface->x
      ;  }
;if(DEBUG2)fprintf(stdbug, "etc handle label we have y %d status %s\n", (int) iface->y, MCXSTATUS(iface->statusy))
   ;  }
      while (0)

;if(DEBUG2)fprintf(stdbug, "status %s\n", MCXSTATUS(status))

   ;  do
      {  if (status)             /* e.g. STATUS_DONE (readline) [or STATUS_IGNORE (#)] */
         break
         /* below iface->statusy == STATUS_NEW should be impossible
          * given this clause and the code sequence earlier.
         */
      ;  if (iface->statusx == STATUS_FAIL || iface->statusx == STATUS_IGNORE)
         {  mcxTingFree(&xkey)
         ;  status = iface->statusx
         ;  break
      ;  }
         /* case iface->statusx == STATUS_NEW is *always* honored */
      ;  if (iface->statusy == STATUS_FAIL || iface->statusy == STATUS_IGNORE)
         {  mcxTingFree(&ykey)
         ;  status = iface->statusy
         ;  break
      ;  }
      }
      while (0)

   ;  if (status == STATUS_IGNORE || status == STATUS_FAIL)
      mcxTingFree(&ykey)
      /* fixme, the action in this branch is done in other places too. cleanify design */

   ;  if
      (  iface->statusx == STATUS_IGNORE
      || !mcxStrChrAint(state->etcbuf->str+state->etcbuf_ofs, isspace, -1)
      )
      state->etcbuf_ofs = state->etcbuf->len

;if(DEBUG3)fprintf
(  stdbug, "read_etc %s return x(%s -> %d stat=%s) y(%s -> %d stat=%s) status %s buf %d %d c_max_seen %lu\n"
,  MCXSTATUS(status)
,  (xkey ? xkey->str : "-"), (int) iface->x, MCXSTATUS(iface->statusx)
,  (ykey ? ykey->str : "-"), (int) iface->y, MCXSTATUS(iface->statusy)
,  MCXSTATUS(status), (int) state->etcbuf->len, (int) state->etcbuf_ofs
,  (ulong) iface->map_c->max_seen
)
   ;  return status
   ;  }
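
/* Illustrative sketch, not part of the library: the offset-based token
 * walk that read_etc performs on a buffered line. Each call consumes one
 * white-space delimited token and advances the offset by the count that
 * %n reports, which is how read_etc moves etcbuf_ofs across the x label
 * and the y labels that follow it. The name sketch_next_token, the 64-byte
 * token buffer and the %63 field width are assumptions made for this
 * example; a longer token would be returned in pieces. Kept under #if 0 so
 * it never becomes part of this translation unit.
*/
#if 0
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy the next token found at buf + *ofs into tok (at least 64 bytes)
 * and advance *ofs past it.
 * Returns 1 if a token was read, 0 if only white space (or nothing) is left.
 */
static int sketch_next_token(const char *buf, size_t *ofs, char *tok)
{
    int n = 0;

    assert(buf && ofs && tok);
    assert(*ofs <= strlen(buf));

    /* %s skips leading white space; %n counts it as consumed as well. */
    if (sscanf(buf + *ofs, "%63s%n", tok, &n) != 1)
        return 0;

    *ofs += (size_t) n;
    return 1;
}
```
#endif
/* With an etc line such as "a b c", the first token would be the x label
 * and each later token a y label for that same x.
*/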