Mon, 15 Mar 1999


  A step-by-step introduction to mappings:

  1. What is a mapping?

    A mapping is a datatype which allows to store data associated to a key.
  In other languages  they are also  known  as 'dictionaries' or  'alists'.
  There are also alists in LPC but they are not a separate datatype but are
  implemented on  top of arrays.  Alists are  the predecessors of mappings.
  The keys and the values  can be of  any type.  But most common  datatypes
  for keys are strings, integers and objects.  Others like arrays, mappings
  or closures aren't a good choice because comparision between i.e.  arrays
  often returns false  even if they equal  in content.  This is because the
  driver compares i.e. two arrays by their internal pointers and not by
  their content. The reason for this is simple: speed.

    Mappings  are allways  treated  as references    when passing  them  to
  functions. This means when you pass a mapping  to another object and this
  object modifies the mapping the modification will  take place in a global
  scope - visible to all objects holding this mapping in a variable.

  2. What are mappings good for?

    The term 'dictionary'  probably describes the  use  of a mapping  best.
  Opposed  to arrays mappings don't have  a specific  order. They provide a
  mechanism to   create  a set  of associations  between  values.  Such  an
  association consists of a unique  key and data  that is identified by the
  key. Think of  a dictionary  where you have  a  word and a definition  of
  it. You use the word to lookup its definition.

    Mappings can be used i.e.  to hold  aliases for commands. The key would
  then be the  name of  the alias and  the  data the command(s) behind   an
  alias.  Or they can be used for  the exits of a  room.  The keys would be
  the directions where one can go  to and the associated  data would be the
  file names of the  rooms.  But mappings can  also be used  as a kind of a
  sparse array.   A  sparse array is  an  array where most of  the elements
  aren't used  (occupied by 0).  I.e.  if  you want to  store values at the
  position 0, 13  and 37642 of an  array you would  have to create an array
  with a size of at least 37643.  This  costs a lot  of memory so a mapping
  would be  more useful because you would  then use the  numbers 0,  13 and
  37642 as a key and not as an index to a position  (actually the keys of a
  mapping are sometimes  called indices  but this  is just  because the way
  data is accessed in a mapping is similar to  arrays: by the [] operator).
  This also allows to  query all occupied   positions of a sparse array  by
  querying for all  the keys of  the mapping opposed  to an array where you
  have to iterate over all elements.

  3. How do I create a mapping?

    There are several ways to do so. The most convenient is the following:

      mapping map;
      map = ([ key0: value00; ...; value0n,
               ... : ...    ; ...; ...    ,
               keyn: valuen0; ...; valuenn ]);

    As you can see, a key may  have more than  one value assigned.  But the
  amount of values per key must always be equal.  It  is  even  possible to
  have mappings without any values!
    Another  method is  to use the   efun mkmapping().  This  efun gets two
  arguments with the first beeing an array of keys and the following beeing
  arrays of values:

      mapping map;
      map = mkmapping (({ key0   , ..., keyn    }),
                       ({ value00, ..., value0n }),
                       ({ ...    , ..., ...     }),
                       ({ valuen0, ..., valuenn }));

    If the efun only gets one argument, then this argument will be taken as
  an array of keys and a mapping  without values will  be returned.

    An empty mapping can be created by using the above described methods by
  simply ommitting the keys and values:

      mapping map;
      map = ([]);
      map = mkmapping(({}), ({}));

    Or  by  using the efun   m_allocate().  This efun gets  as  first
  argument the  amount  of keys which will  be  added soon and  an optional
  second argument specifying the width of the mapping:

      map = m_allocate(n, width);

    The value <n>  may be a bit  confusing  since mappings shrink and  grow
  dynamically. This value just tells the driver how 'long' this  mapping is
  going to be  so proper memory  allocations  will be  performed to  reduce
  the overhead of memory reallocation.  I.e.  if you want to read in a file
  and store the  read data in  a mapping  you probably  know  the amount of
  keys.  So you allocate  a mapping with this  efun and tell the driver how
  much memory should  be allocated  by specifing  a proper <n>  value.
  Thus causing  a    speedup when adding  the  read   data to the   mapping
  afterwards.    The <width> just specifies how   many  values per key this
  mapping is   going to have. If  no  width is given, 1  will  be  taken as

  An empty mapping created with '([])' will always have a width of 1. To
  create empty mappings with other widths, write it as

      map = ([:width ]);

  <width> can be any expression returning an integer value (including
  function calls), and in fact this notation is just a fancy way of

      map = m_allocate(0, width);

  4. How can I modify the data of a mapping?

    Adding a  new key is similiar to   modifying the associated  data of an
  existing key:

      map += ([ key: value0; ...; valuen ]);

    Or in case only a single value should be modified:

      map[key, n] = valuen;

    If  <n> is out of  range or if <key> doesn't  exists and <n> is greater
  than 0 an "Illegal index" error will be reported. If <n> is equal to 0 or
  the mapping only has a single value per key one can abbreviate it with:

      map[key] = value;

    If there is no <key> (and <n> is equal to 0 or  not specified at all) a
  new one will be added automatically.

    Deletion   of a key    is  done with    the  -=  operator or  the  efun
  m_delete(). A mapping can only be substracted by one without any values:

      map -= ([ key ]);
      map -= ([ key0, ..., keyn ]);

  The efun takes a mapping as first and a key as second argument:

      m_delete(map, key);

    The  efun   m_delete() returns  the mapping   but because  mappings are
  handled as references there is no need of an assignment like:

      map = m_delete(map, key);

  5. How can I access the data stored in a mapping?

    This can be done by:

      valuen = map[key, n];

    Or in case of a mapping with just one value per key:

      value0 = map[key];

    If there is no  <key> in the mapping  and <n> is  0 or not specified at
  all (which is the same) a 0 will be returned or if <n>  is greater than 0
  an "Illegal index" error will be reported.

  6. How can I test for the existance of a key?

    A  return value of 0 is  sufficient for most applications but sometimes
  the ambiguity  between an existing value of  0 and  a nonexisting key can
  lead   to  a  problem.  Therefore   one  can use   the  efun member()  or
  mapping_contains() to check if there actually is a key in the mapping:

      if (member(map, key)) {
      if (mapping_contains(&value0, ..., &valuen, map, key)) {

    This also shows how  one can retrieve all values   associated to a  key
  from a mapping in a single step. The '&' is  the reference operator which
  is neccesary to let the efun store the values in the variables.

    In case   of  mappings   with   no  values,   the  efun   member()  and
  mapping_contains() are equal in their behaviour  and their way of calling
  because mapping_contains() won't get any reference variables to store the
  values in (obviously, because there aren't any).

     Also normally member() is known to return the postion of an element in
  a list (i.e.  a  character in a  string or data   in an array) and if  an
  element couldn't be  found -1 is returned.   But in the case  of mappings
  there are no such things as order and postion. So member() only returns 0
  or 1.

  7. How can I copy a mapping?

    A  mapping can  be  copied   with  the  +  operator   or by the    efun

      newmap = ([]) + map;
      newmap = copy_mapping(map);

    A mapping should only be copied when it is neccesary to get an own copy
  of it that  must not be  shared by other objects.

  8. How can I get all keys of a mapping?

    The  efun m_indices() gets a mapping  as argument  and returns an array
  holding all keys defined in this mapping:

      keys = m_indices(map);

  9. How can I get all the values of a mapping?

    The efun m_values() gets  a mapping as  argument  and returns  an array
  holding all the first (second, ...) values of it.

      values0 = m_values(map);     returns the first values
      values0 = m_values(map, 0);  dito
      values1 = m_values(map, 1);  returns the second values

  10. How can I determine the size of a mapping?

    Because a mapping is a kind of rectangle it has two sizes: a length and
  a width.  There are three different efuns  to query these values. The first
  two are the  efuns sizeof(), which returns the  amount of key-value
  associations (the length of  a mapping), and widthof(), which returns the
  number of values per key (the width). The third is the efun get_type_info().
  get_type_info() is meant  to be a function  to identify a datatype.   Its
  return value is an  array of two  numerical values.  The first  specifies
  the datatype   of the argument and   the second is a   datatype dependend
  value. In the case of a mapping the first value  is T_MAPPING (which is a
  value defined in  <lpctypes.h>) and the  second the amount of values  per
  key (a.k.a.  columns or the width  of the mapping  - actually it would be
  correct to say that the width of a mapping is the  amount of columns plus
  one for the keys but this is uncommon).

  11. What is the best method to iterate over a mapping?

    First of all the main purpose of a mapping is not meant to  be a set of
  data to iterate over. Afterall the keys in a mapping have no specific but
  a random order (at least on the LPC side).  But  still it is possible and
  sometimes even neccesary to do so.

    If all key-value associations  should be processed  then one should use
  walk_mapping().  If all keys of a mapping should be processed to create a
  new mapping being a subset of the given one, then filter_mapping() should
  be  used.  If all  keys  are going to  be  processed and  to create a new
  mapping with the  same set of keys as  the given mapping, then one  would
  use map_mapping().  But in the case of an  iteration that should/can stop
  even if not all data is processed it is probably wise to iterate over the
  mapping by first querying for the keys and then to iterate over them with
  a for() or a while() loop and querying the values by 'hand'.

    The efun walk_mapping() gets  a mapping as  first argument and the name
  of a function  as second one. All the  following arguments are treated as
  extras which  will  be  passed to the   function specified  with the  2nd
  argument. Instead of a string for the name of a function a closure can be
  used, too. Nothing will be returned:

      walk_mapping(map, "func", xarg0, ..., xargn);

      void func(mixed key, mixed value0, ..., mixed valuen,
                mixed xarg0, ..., mixed xargn) {

    func() will be called for all key-value associations  and gets as first
  argument the key.  The next arguments are the  values behind the key  and
  are passed as references.  The  rest  of the  passed arguments are  those
  specified as extras. Because the values are passed as references (opposed
  to  copies) it is possible  to modify them  from  inside func() by simply
  assigning new value to the variables <value0>, ..., <valuen>.

    The efun filter_mapping() calls  a function for  each key in  a mapping
  and creates a new mapping  which only contains key-value associations for
  which the called function returned true (not  equal 0 that is). The first
  argument is the mapping to iterate over and the second is a function name
  given as a string or a closure:

      submap = filter_mapping(map, "func", xarg0, ..., xargn);

      int func(mixed key, mixed xarg0, ..., mixed xargn) {

    func() gets  as first argument the key  and the others are those passed
  as extras to filter_mapping().

    The efun map_mapping() gets a mapping as first argument and a string as
  a function name (or again a closure) as  second argument.  Any additional
  arguments are again used  as extras that will  be passed to the iteration
  function. This efun returns a new mapping with the same keys as the given
  one.  The values  returned by the function  that is invoked  for each key
  will be used as the associated data behind each key of the new mapping:

      newmap = map_mapping(map, "func", xarg0, ..., xargn);

      mixed func(mixed key, mixed xarg0, ..., mixed xargn) {

    func() gets  as first argument the key  and the others are those passed
  as extras to map_mapping().

    Because a function can only return  a single value  (even when it is an
  array) it restricts the use  of map_mapping() to  only allow creation  of
  mappings with a single value per key.

  12. Is it possible to join/intersect/cut mappings with another?

    Joining mappings is only possible, if  they have the same width (amount
  of values per key). One can use the + and += operator:

      map = map1 + map2 + ... + mapn;
      map += map1 + map2 + ... + mapn;

    Intersection     of   two   mappings is    only      possible by  using
  filter_mapping(). There is  no efun or operator  which features this. The
  'easiest' way may be the following function:

      mapping intersect_mapping(mapping map1, mapping map2) {
        closure cl;

        cl = lambda(({ 'key }), ({ #'member, map2, 'key }));
        return filter_mapping(map1, cl, map2);

    This function returns a  new mapping which   consists of all  key-value
  associations   of  <map1>  for which  an  equal  key  could   be found in
  <map2>. This function uses  a closure which returns 0  or 1  depending on
  wether a key from <map1> is contained in <map2> or not.

    Cutting out  all key-value associations of a   mapping for which  a key
  could be  found in another mapping  can  be done  by using  the  - and -=

      mapping cut_mapping(mapping map1, mapping map2) {
        return map1 - mkmapping(m_indices(map2));

    Because a maping can  only be substracted by one  without any values we
  first have to create such by using m_indices() and mkmapping().

  13. What are those mappings without any values (besides keys) good for?

    Because the way how the driver  searches for a  key in a mapping is
  rather fast, those mappings can be used as a  set of elements with a fast
  method for testing if an element is  contained in the set. This technique
  is called hashing (further  explanation   would lead  too far)  which  is
  faster  than searching for  values  in array  (which  is done in a linear

    Another (maybe  more pratical) use  of these  mappings  are to create a
  array of unique values out of an array with several equal values:

      uniques = m_indices(mkmapping(array));

    mkmapping() uses  <array> to  create  a mapping without any  values but
  just keys. And  because a mapping can only  have unique keys all multiple
  values in <array> are taken as one.  The call of m_indices() then returns
  an  array  of  these  unique keys.  Actually we  only  make  use of those
  mappings temporarily.

  14. How can I convert an alist into a  mapping and vice versa?

    There are no special efuns which handle such conversions. But it can be
  done by the following functions:

      mapping alist_to_mapping(mixed *alist) {
        return apply(#'mkmapping, alist);

    The efun apply() takes a closure and an array of values and passes each
  element of the  array as an  argument  to the  closure. Because  an alist
  consists of an array of arrays with the first beeing the list of keys and
  the others the values associated to each key passing them as arguments to
  the efun closure #'mkmapping via apply() causes the creation of a mapping
  out of an alist.

      mixed *mapping_to_alist(mapping map) {
        mixed *alist;
        symbol *vars;
        string var;
        closure cl;
        int width;

        width = get_type_info(map)[1];
        alist = allocate(width + 1);
        vars  = allocate(width + 2);
        for (var = "a"; width; var[0]++, width--) {
          alist[width] = ({});
          vars[width]  = quote(var);
        alist[0] = ({});
        vars[0]  = 'key;
        vars[<1] = 'alist;
        cl = lambda(vars, ({ #'=, 'alist, ({ #'insert_alist }) + vars }));
        walk_mapping(map, cl, &alist);
        return alist;

    This function is  a bit more  complicated  than the other  and detailed
  description would lead   too far of   the topic.  This  function has  one
  restriction:  it can only turn a  mappings with up to  26  values per key
  into an  alist.    But  this  should  be   sufficient for probably    all
  applications which use mappings.

  And Hyps further comment on this:
        The function mapping_to_alist() is also not that
        clever because insert_alist() allways creates a new
        alist.  A second (optional) argument to m_values() to
        specify the value column would be better. Besides
        this, the conversion of a mapping into an alist could
        be done by to_array().

  15. Dirty Mappings

  'Dirty mappings' are nothing the LPC programmer directly is involved
  with, however, as it relates to the way mappings are implemented
  internally by the gamedriver. However, as this term shows up in
  various driver statistics, it is explained here.

  There are two fundamental approaches to implement mappings:

    1. Store all data entries in an array-like structure, in sorted order. 
    2. Store all data in a hashtable, each entry allocaed separately.

  Method 1 is very space efficient, as it doesn't need much overhead
  per entry; however, insertions and deletions of entries are
  relatively slow as all other entries need to be moved.     
  Method 2 is very fast as nothing needs to be moved in memory,
  however it has a large overhead.

  The gamedriver uses a hybrid method: at the basis is a mapping
  implementation based on arrays. However the driver uses a hash table
  in addition to handle all the ongoing insertions and deletions.
  Every once in a while, the contents of the hash table are sorted
  into the base array, reasoning that any entry surviving for longer
  time in the hash table is worth keeping in a more space-efficient
  manner. 'Dirty' mappings are such mappings with both an array and a
  hash part, 'clean' mappings are those with just an array part.


        The ([:width ]) notation was added in LDMud 3.2.9/3.3.208 .


	alists(LPC), closures(LPC), structs(LPC), mkmapping(E),

UNItopia (