Skip to content

A short excursion away from filwisher/wed to better understand piece-chains

License

Notifications You must be signed in to change notification settings

FilWisher/pisschin

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

20 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

pisschin

A short excursion away from wed to better understand piece-chains

Imgur

More detailed sources at the end of this document.

huh?

Piece-chains are a structure for managing text.

Imagine loading a big piece of text into a buffer. If you want to make edits to that text, all the changes you make are directly on the buffer. This may mean splitting or copying parts of the buffer whenever you make edits. It also means that time to load or edit a file is function of the file size.

If we create an abstraction layer over the piece of text, all of our edits are on the abstraction layer.

Piece-chains are an abstraction layer that view a text as a chain of pieces. Each piece is a description of source of data, a starting point in that data, and a length. A piece chain is a chain of these pieces which together make up a text. Now editing the text is simply editing these descriptions: for example to delete text we simple decrement the value of the length. The time to edit or load a file is now a function of the number of pieces, regardless of whether the file is 1 line or 1000000000 lines.

data types

A piece is an abstract data type of two parts: Piece and Piece_Descriptor.

A Piece is a pointer to a Piece_Descriptor:

typedef struct {
	/* ... */
} Piece_Descriptor;

typedef Piece_Descriptor *Piece;

A Piece_Descriptor is a struct with a file descriptor, a position, a length, pointer to the next Piece_Descriptor, and a pointer to the previous Piece_Descriptor:

typedef struct piece_descriptor {
	int file; /* file descriptor */
	int pos;
	int len;
	piece_descriptor *next;
	piece_descriptor prev;
} Piece_Descriptor;

The collection of Piece_Descriptors forms a doubly-linked list. This means that the Piece_Descriptors needn't be contiguous in memory.

A piece chain is a created by pointing the .next value of piece descriptor A at another piece descriptor B. Piece descriptor B should also have its .prev value set to a pointer to piece descriptor B and, if it is the final piece descriptor, it's .next value should be set to NULL.

Piece
load_file(char *name, Piece prev, Piece next)
{
  Piece p = p_alloc();
  p->file = open(name, O_RDONLY, 0);
  p->len = file_size(p->file);
  p->prev = prev;
  if (prev != NULL) prev->next = p;
  p->next = next;
  if (next != NULL) next->prev = p;
  return p;
}

/* ... */

	Piece first_piece = load_file("as.txt", NULL, NULL);
	Piece second_piece = load_file("bs.txt", first_piece, NULL);

now what?

Once we have a piece chain constructed, there a few simple things we want to do with it:

  • Map a position in the text to a position in the piece chain:
int text_pos_to_pc_pos(Piece p, int text_pos);
  • Insert new pieces anywhere in the chain
  • Delete pieces anywhere in the chain
  • Write piece chain (text) to file
  • Render piece chain (text) on screen
  • Send piece chain (text) to another process

These simple things will require slightly more code than if we were working with a buffer but it will pay off in performance.

code

data types

Location Text Piece Piece_Descriptor

functions

t_* functions are for texts

  /* returns Text struct with single piece */
  Text t_load_file(char *name);
  /* prints entire piece chain (text) */
  void t_print(Text t);
  /* frees entire piece chain (text) */
  void t_free(Text t);

e_* functions are for text editing

  /* deletes part of text */
  void e_delete(Location loc);

p_* functions are for pieces

  /* returns address of piece sized memory on heap */
	Piece_p_alloc(void);
  /* returns new piece */
	Piece p_load_file(char *name, Piece prev, Piece next);
  /* prints a piece to stdout */
	void p_print(Piece p);
  /* frees a single piece's allocated memory */
	void p_free(Piece p);
  /* Returns a Location with corresponding .piece and .offset to pos  */
	Location p_location_of(Text t, int pos);

/* takes a file descriptor and returns it size */ int file_size(int file);

sources

[0] Data Structures for Text Sequences - Charles Crowley

About

A short excursion away from filwisher/wed to better understand piece-chains

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published