Skip to content

geoffmcl/edbrowse

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

edbrowse, a line oriented editor browser.
Written and maintained by Karl Dahlke and others.
See our home page edbrowse.org for current releases and contact information.

See COPYING for licensing agreements.

------------------------------------------------------------

Disclaimer: this software is provided as-is,
with no guarantee that it will perform as expected.
It might trash your precious files.
It might send bad data across the Internet,
causing you to buy a $37,000 elephant instead of
$37 worth of printer supplies.
It may delete all the rows in your mysql customer table.
Use this program at your own risk.

------------------------------------------------------------

Chrome and Explorer are graphical browsers.
Lynx and Links are screen browsers.
This is a command line browser, the only one of its kind.
The user's guide can be found as doc/usersguide.html in this package,
or online at http://edbrowse.org/usersguide.html.
The online guide corresponds to the latest stable release.
Of course this reasoning is a bit circular.
You need to use a browser to read the documentation,
which describes how to use the browser.
Well you can always do this:

cd doc ; lynx -dump usersguide.html >usersguide.txt

This produces the documentation in text form,
which you can read using your favorite editor.
Of course we hope edbrowse will eventually become your
favorite editor, whence you can browse the documentation directly.
The doc directory also includes a sample config file.

------------------------------------------------------------

OK, I'm going to assume you've read the documentation.
No need to repeat all that here.
You're here because you want to compile and/or package the program,
or modify the source in some way.  Great!

Requirements:

pcre:
As you may know, edbrowse was originally a perl script.
As such, it was only natural to use perl regular expressions for
the search/substitute functions in the editor.
Once you've experienced the power of perl regexp, you'll never
go back to ed.  So I use the perl-compatible regular expression
library, /lib/libpcre.so.0, available on most Linux systems.
If you don't have this file, check your installation disks.
the pcre and pcre-devel packages might be there, just not installed.
You need version 8.10 or higher.

Note that my files include <pcre.h>.
Some distributions put it in /usr/include/pcre/pcre.h,
so you'll have to adjust the source, the -I path, or make a link.

libcurl:
You need libcurl and libcurl-devel,
which are included in almost every Linux distro.
This is used for ftp, http, and https.
Check for /usr/include/curl/curl.h
Edbrowse requires version 7.29.0 or later.  If you compiled with a version
prior to 7.29.0, the program will inform you that you need to upgrade.
If you have to compile curl from source, be sure to specify
--ENABLE-VERSION-SYMBOLS at the configure script.
It's rare, but curl, and hence edbrowse, cannot access certain websites,
giving the message
Cannot communicate securely with peer: no common encryption algorithm(s).
You can even see this from the command line.
	curl https://weloveanimals.me
You'll either get the communication error or not.
This happens if openssl is too old,
or just doesn't support the ciphers that the website expects.
This is beyond edbrowse, and beyond curl; you have to upgrade openssl.

tidy:
Edbrowse now uses the tidy-html5 HTML parser.  So there are a couple
of things to install for this prerequisite.
The tidy-html5 compilation process uses cmake.  Please either use your
package manager to get cmake (for instance, apt-get install cmake),
or follow the instructions at http://www.cmake.org/download/

Once you have cmake, follow the latest tidy-html5 code from:
git clone git://github.com/htacg/tidy-html5
cd tidy-html5/build/cmake
cmake ../..
make
make install # as root
Now the latest tidy-html5 library will be available to edbrowse.
you may have to run ldconfig to access this new library.
Edbrowse requires tidy-html5 version 5.1.25 or greater,
and might not work properly, or even compile with an earlier version.
Note that the latest tagged version is 5.1.25, but 5.1.26 and later
have some fixes for known issues that have been reported upstream.

duktape:
We used this javascript engine for several years, but we don't use it any more.
If you want to be able to build edbrowseduk, as a separate target,
then you need the duktape library. Distributors can skip this step.
If duktape is not part of your distribution, download from github, compile, and install.

git clone https://github.com/svaarala/duktape.git
cd duktape
#  build the version for distribution, without all the debugging features
make dist
make dist/source
cd dist/source
make -f Makefile.sharedlibrary
make -f Makefile.sharedlibrary install (as root)
make -f Makefile.cmdline
#  if you want the duktape shell
ln -s `pwd`/duk /usr/local/bin/duk (as root)
you may have to run ldconfig to access this new library.

quickjs:
This is the javascript engine for edbrowse.
It is probably not packaged.
git clone https://github.com/bellard/quickjs
cd quickjs
make
make install  # as root

This builds and installs a static library- libquickjs.a.
There is no shared library out of the box.
If you want to distribute a separate package for quickjs, you may wish to
edit the makefile and build a shared library instead.
By default, edbrowse links to the static library, and thus the
quickjs code is part of edbrowse, and a quickjs package is not necessary.

unixODBC:
If you want database access, you need unixODBC and unixODBC-devel.
Select the odbc option via:
make BUILD_EDBR_ODBC=on in the src directory, or from build,
cmake -DBUILD_EDBR_ODBC:BOOL=ON ..
ODBC has been very stable for a long time.
unixODBC version 2.2.14 seems to satisfy edbrowse with odbc.

------------------------------------------------------------

Compiling edbrowse:

cmake can be used to build and install edbrowse in Windows, linux, or Mac OS X.
visual C studio is assumed on Windows.
cd build
cmake .. [options]
make
Or, on linux, you can use the traditional makefiles.
cd src
make
Linux makefile supports the environment variables EBDEBUG=on for
symbolic debugging via gdb, and EBDEMIN=on for dynamic deminimization
of javascript within edbrowse - also for debugging.
Users and distributers will not need these options.

On Mac OS X, you'll need to install tidy-html5.  It's not in macports
yet.  Tell cmake about its location like this:
export TIDY_ROOT=/path/to/tidy-html5
cd build
cmake ..
make
Be careful here.  Apple distributes an ancient version of tidy-html.
You're going to have to take steps to insure that they do not collide.
Perhaps your best bet is to get edbrowse from MacPorts if possible.

------------------------------------------------------------

Edbrowse creates a system wide temp directory if it is not already present.
This is /tmp/.edbrowse in Unix, and $(TEMP)/edbrowse in Windows.
This directory contains a subdirectory per user, mod 700 for added security.
Thus one user cannot spy on the temp files, perhaps sensitive internet data,
of another user.
However, true multiuser security requires a root job at startup,
e.g. in /etc/rc.d/rc.local, to create the directory with the sticky bit.
	mkdir /tmp/.edbrowse
	chmod 1777 /tmp/.edbrowse

------------------------------------------------------------

The code in this project is indented via the script Lindent,
which is in the tools directory, and is taken from the Linux kernel source.
In other words, the indenting style is the same as the Linux kernel.
If you modify some source, you may want to run it through
../tools/Lindent before the commit.

In the interest of portability to Window Studio C, please keep variable
definitions at the top of each function or block.
A variable should not be defined after an executable statement.

------------------------------------------------------------

Debug levels:
0: silent
1: show the sizes of files and web pages as they are read and written
2: show the url as you call up a web page,
and http redirection.
3: javascript execution and errors.
   cookies, http codes, form data, and sql statements logged.
4: show the socket connections, and the http headers in and out.
   html syntax errors as per tidy5.
   side effects of running javascript.
   Dynamic node linkage.
5: messages to and from javascript, url resolution, tidy html nodes.
   Tree of nodes internal to edbrowse.
6: show javascript to be executed
7: reformatted regular expressions, breakline chunks,
JSValues allocated and freed.
8: text lines freed, debug garbage collection
9: not used

Casual users should not go beyond db2.
Even developers rarely go beyond db4.

------------------------------------------------------------

Sourcefiles as follows.

src/main.c:
Read and parse the config file.
Entry point.
Command line options.
Invoke mail client if mail options are present.
If run as an editor/browser, treat arguments as files or URLs
and read them into buffers.
Read commands from stdin and invoke them via the command
interpreter in buffers.c.
Handle interrupt.

src/buffers.c:
Manage all the text buffers.
Interpret the standard ed commands, move, copy, delete, substitute, etc.
Run the 2 letter commands, such as qt to quit.

src/stringfile.c:
Helper functions to manage memory, strings, files, directories.

src/isup.c:
Internet support routines.
Split a url into its components.
Decide if it's a proxy url.
Resolve relative url into absolute url
based on the location of the current web page.
Send and receive cookies.  Maintain the cookie jar.
Maintain a cache of http files.
Remember user and password for web pages that require authentication.
Only the basic method is implemented at this time.


src/format.c:
Arrange text into lines and paragraphs.
base64 encode and decode for email.
Convert utf8, iso8859-1, unicode 16, unicode 32, etc.

src/http.c:
Send the http request, and read the data from the web server.
Handles https connections as well,
and 301/302 redirection.
ftp, sftp, download files, possibly in the background.

src/html.c:
Manage the html tags and the tree of nodes.
Turn js side effects, like document.write or innerHTML,
back into html tags if that makes sense.
Submit/reset forms.
Render the tree of html nodes into a text buffer.
Rerender the tree after js has run, and report any changes to the user.

src/sendmail.c:
Send mail (smtp or smtps).  Encode attachments.

src/fetchmail.c:
Fetch mail (pop3 or pop3s or imap).  Decode attachments.
Browse mail files, separate mime components.
Delete emails, move emails to other imap folders, search on the imap server.

src/plugin.c:
Determine the mime type of a file or web page and the corresponding plugin,
if any. Launch the plugin automatically or on command.
A plugin can play the file, like music, or render the file, like pdf.

src/messages.h:
Symbolic constants for the warning/error messages of edbrowse.

src/messages.c:
International print routines to display the message according to your locale.

lang/msg-*:
Edbrowse status and error messages in various languages.
Each is converted into a const array of messages in src/msg-strings.c,
thus src/msg-strings.c is not a source file.

lang/ebrc-*:
Default .ebrc config file that is written to your home directory
if you have no such file.
Different files for different languages.
Each is converted into a const string in src/ebrc.c,
thus src/ebrc.c is not a source file.

src/decorate.c:
Decorate the tree with js objects corresponding to the html nodes
if js is enabled.

src/jseng-duk.c:
The javascript engine built around the duktape js library.
Manage all the js objects corresponding to the web page in edbrowse.
All the js details are hidden in this file.
this is encapsulation, hiding the js library from the rest of edbrowse.

src/jseng-quick.c:
The javascript engine built around the quick js library.
Manage all the js objects corresponding to the web page in edbrowse.

src/jseng-moz.cpp:
The javascript engine built around the mozilla js library.
Manage all the js objects corresponding to the web page in edbrowse.

src/js_hello*
Various hello world files to exercise various javascript engines.
These are stand alone programs; build them (on linux) by make hello.

src/startwindow.js:
Javascript that is run at the start of each session.
This creates certain classes and methods that client js will need.
It is converted into a const string in src/startwindow.c,
thus src/startwindow.c is not a source file.
As you write functions to support DOM,
your first preference is to write them in src/startwindow.js.
Failing this, write them in C, using the API presented by jseng-quick.c.
Failing this, and as a last resort, write them as native code within the js engine.
Obviously this last approach is not engine portable.

src/third.js:
Third party open source javascript routines that are used for debugging.
These are snapshots; you will need to update third.js, i.e. grab a new
snapshot, as this software evolves.
Distributers don't have to worry about this one,
it isn't compiled in unless $EBDEMIN is set to 1.

src/endwindow.js:
This is the close of startwindow.js, and it stands in if third.js is not used.

src/html-tidy.c:
Use tidy5 to parse html and return a tree of nodes.
This is another form of encapsulation.
We could, in the future, write html-foo.c, having the same interface,
if we prefer html parser foo instead.

src/jsrt:
This is the javascript regression test for edbrowse.
It exercises some of the javascript DOM interactions.
It also presents frames and hyperlinks and forms and input fields,
so you can play around.

src/acid3:
A snapshot of http://acid3.acidtests.org, with modifications,
so that some or all of the acid tests pass under edbrowse.
This is a work in progress.
My modifications are indicated by the comment   //@`

win32/dirent.c:
Access directories in Windows.

win32/vsprtf.c:
Windows implementation of asprintf().

src/dbops.c:
Database operations; insert update delete.

src/dbodbc.c:
Connect edbrowse to odbc.

src/dbinfx.ec:
Connect edbrowse directly to Informix.
Other connectors could be built, e.g. Oracle,
but it's probably easier just to go through odbc.

src/dbstubs.c:
Stubs for database functions, if you build edbrowse without database access.

------------------------------------------------------------

Error conventions.
Unix commands return 0 for ok and a negative number for a problem.
Some of my functions work this way, but most return
true for success and false for error.
The error message is left in a buffer, which you can see by typing h
in the /bin/ed style.
Sometimes the error is displayed no matter what,
like when you are reading or writing files.
error messages are created according to your locale, i.e. in your language,
if a translation is available.
Some error messages in the database world have not yet been internationalized.
Some are beyond my control, as they come from odbc or its native driver.

------------------------------------------------------------

Multiple Representations.

A web form asks for your name, and you type in Fred Flintstone.
This piece of data is part of your edbrowse buffer.
In this sense it is merely text.
You can make corrections with the substitute command, etc.
Yet it is also carried into the html tags in html.c,
so that it can be sent when you push the submit button.
This is a second copy of the data.
As if that weren't bad enough, I now need a third copy for javascript.
When js accesses form.fullname.value, it needs to find,
and in some cases change, the text that you entered.
These 3 representations are "separate but equal",
using a lot of software to keep them in sync.
Remember that an input field could be an entire text area,
i.e. the text in another editing session.
When you are in that session, composing your thoughts,
am I really going to take every textual change, every substitute,
every delete, every insert, every undo,
and map those changes over to the html tag that goes with this session,
and the js variable that goes with this session?
I don't think so.
When you submit the form, or run any javascript for any reason,
the text is carried into the html tag, under t->value, and into the js object,
to make sure everything is in sync before js runs.
This is accomplished by jSyncup() in html.c.
When js has run to completion, any changes it has made to the fields have
to be mapped back to the editor, where you can see them.
This is done by jSideEffects() in html.c.
In other words, any action that might in any way involve js
must begin with jSyncup() and end with jSideEffects().
Once this is done, the tree of tags is rerendered,
and the new buffer is compared with the old using a very simple diff algorithm.
Edbrowse tells you if any lines have changed.

Line 357 has been updated.

Such updates are only printed every 20 seconds or so, since some visual websites change data, down in the lower left corner, a dozen times a second,
and we don't need to see a continuous stream of update messages.
However, if you submit something and that changes the screen, you want to know about that right away.
Implementing all of this was not trivial!

------------------------------------------------------------

Some text is invisible as per css{display:none},
and some text only comes to light if you hover over something.
Edbrowse does not display this text, but sometimes edbrowse gets it wrong,
so if the website seems sparse, like you're missing something important,
use the showall command to reveal all of this text,
even some sections that might not be relevant to your situation.
Formerly invisible text looks like this.

`{
You are logged in as John Smith,
if you are not John Smith please <log out>.
}'

This block might be invisible unless you are actually logged in.
All text is displayed if javascript is disabled via the js- command,
because css doesn't run without javascript.

------------------------------------------------------------

Use the help command for a quick list of all the edbrowse commands.
This is a copy of the quick reference guide in usersguide.html.

------------------------------------------------------------

There is an in-built javascript dom debugger that you enter via the jdb command.
. or bye to exit.
Javascript expressions are evaluated, and the document objects are available.
document.head is the head of your document <head>,
document.body is the body <body>,
document.body.firstChild is the first node under <body>, and so on.
showscripts() shows all the javascripts, even those dynamically created.
Such debugging is perhaps beyond the scope of this README file.
Read the Debugging Javascript article in the edbrowse wiki.

About

Fork of CMB/edbrowse - A command-line editor and web browser.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • C 51.7%
  • JavaScript 28.7%
  • Perl 10.2%
  • C++ 5.7%
  • eC 1.8%
  • CMake 1.3%
  • Other 0.6%