Evil cursor model

18 February 2021 Updated 19 March 2021

I recently got enticed by Vim's text editing model, and began my own personal descent into evil-(mode): the full-featured Vim implementation within Emacs itself. [1: Viewed as a text-based application framework cum operating system cum Lisp machine, Emacs wins hands down. It's a pretty decent text editor, too.] But Vim's combination of modal editing and, in particular, composable commands as a text editing language is compelling.

However, even viewed purely as a text editor, Vim doesn't get everything right in my not-so-humble opinion.

The cursor in a text editor indicates the location where the next editing operation should act, both visually to the user and internally to the text editor. There are two conceptually different ways to model the cursor location. You can consider the cursor to be located in between two consecutive characters. Or you can consider it to be located on top of a particular character. Emacs, like almost every other text editor, uses the first model. If the cursor [2: The "point", in Emacs terminology.] is at location 3, say, Emacs considers it to be between the 2nd and 3rd characters in the text. [3: Emacs indexes cursor locations from 1 rather than 0; if the point is located just before the first character in the buffer, the value of point is equal to 1, and that is the minimum possible value it can take.] Typing a character will insert it in between these two characters; deleting backwards will delete character 2; deleting forwards will delete character 3.
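The between-characters behaviour described above is easy to check in plain Emacs Lisp. Here's a minimal sketch, assuming nothing beyond a stock Emacs; the helper name my-cursor-demo is made up for illustration and isn't part of Emacs or evil-mode.

```elisp
;; Hypothetical helper: put point at location 3 in the text "abcdef",
;; run ACTION there, and return the resulting buffer text.
(defun my-cursor-demo (action)
  (with-temp-buffer
    (insert "abcdef")
    (goto-char 3)                 ; between the 2nd and 3rd characters
    (funcall action)
    (buffer-string)))

(my-cursor-demo (lambda () (insert "X")))      ; typing:           "abXcdef"
(my-cursor-demo (lambda () (delete-char -1)))  ; delete backwards: "acdef"
(my-cursor-demo (lambda () (delete-char 1)))   ; delete forwards:  "abdef"
```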

Vim also uses this cursor model in insert mode. [4: I'll use the Vim terminology here, but note that evil-mode calls modes states – so insert state and normal state – to avoid confusion with Emacs' major- and minor-modes, which are something altogether different.] But in normal mode, it uses the other cursor model: if the cursor is at location 3, in normal mode Vim considers it to be located on top of the 3rd character in the text. There are therefore two insertion commands: i will start inserting text just before the 3rd character (i.e. in between characters 2 and 3); a will start inserting text just after the 3rd character (i.e. in between characters 3 and 4). Similarly, P pastes text before the character under the cursor, whereas p pastes it after that character.
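To see how the two insertion commands map onto Emacs point locations, here's a small sketch in plain Emacs Lisp. It only imitates where the text ends up; it doesn't call Vim or evil-mode, and the helper my-vim-insert is hypothetical.

```elisp
;; Hypothetical helper: the normal-mode cursor sits on character N
;; (1-based) of "abcdef"; COMMAND is the symbol `i' or `a'.
(defun my-vim-insert (command n new-text)
  (with-temp-buffer
    (insert "abcdef")
    (goto-char (if (eq command 'a)
                   (1+ n)   ; a: between characters N and N+1
                 n))        ; i: between characters N-1 and N
    (insert new-text)
    (buffer-string)))

(my-vim-insert 'i 3 "X")   ; "abXcdef"
(my-vim-insert 'a 3 "X")   ; "abcXdef"
```

In other words, a on character N does the same thing as i on character N+1, which is the redundancy discussed further on.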

I think Vim got this one wrong. [5: I'm not the only one.] Having to keep two different cursor models in mind adds cognitive burden, for little gain.

Although it might at first seem simpler to model cursor locations and character positions as the same thing, for text editing it turns out to be more natural to model cursor locations as being between characters. One of the principal text editing operations – inserting new characters – inherently occurs between two existing characters. The other principal editing operation – deleting characters – makes equal sense applied to the character before or the character after the current location. Hence all text editors (Vim and Emacs included) have separate forwards and backwards deletion operations. Some text editing operations – e.g. replacing a character – do make sense applied to the character under the cursor. But they arguably also make sense applied to the characters before or after the cursor. (How often have you found yourself wanting to replace the last character you typed?)

The unnatural normal-mode cursor model infects Vim in myriad small ways. For example, exiting insert mode moves the cursor one step backwards, to place it over the previous character, because you're more likely to want to replace or delete the last character you inserted than the character after. But this means entering and immediately exiting insert mode is not side-effect-free: it moves the cursor backwards one character. There are plenty of Stack Exchange questions asking about this, and even a specific evil-mode setting to disable it: evil-move-cursor-back. Another example is that you can't position the cursor after the final character of a line. So you have to use the a command instead of the i command when inserting text at the end of the line, and there's no way to use the normal x deletion command to delete a newline character. [6: Yes, there are the highly useful J and A commands. But the fact that certain standard editing commands fail to work in certain locations is a sign of the model not quite fitting.] This becomes more awkward still when navigating by screen lines rather than logical lines. [7: Cf. visual-line-mode in Emacs, or :set wrap in Vim.] This is sufficiently annoying to sufficiently many people that there are specific Vim and evil-mode settings to alter it: set virtualedit=onemore and evil-move-beyond-eol, respectively.

Vimmers will argue that having two insertion commands allows more precise control over exactly where you want new text to be inserted. Anyway, surely two insertion commands can only ever be better than one? But that argument is too simplistic. It's only true if you frequently find yourself in a situation where both of those commands are potentially useful. If only one of those commands is likely to be useful in any given situation, a single command that always does the right thing is better. It avoids the cognitive burden of having to decide which command to use.

In the case of text insertion, it's rare that you're equally likely to want to insert text before and after a particular character. If the cursor is on the first character of a word, you're far more likely to want to insert new text before the beginning of the word, than in between the first and second character of the word. And conversely, when the cursor is on the last character of a word, you're more likely to want to insert new characters after that word (probably a space followed by the next word), rather than inserting characters before the final character of that word. Similarly for the beginnings and ends of sentences, paragraphs, code blocks, etc. These are particularly significant locations, because seasoned Vimmers (and Emacsers) more often navigate horizontally by entire words, sentences, or paragraphs using w, b, e, ), } etc. (or their Emacs equivalents forward-word, backward-word, forward-sentence, forward-paragraph etc.), rather than navigating by single characters using h and l. And if you need to insert text in the middle of a word, it's as easy to navigate to the character immediately after where you want to insert the text and hit i, as it is to navigate to the character immediately before and hit a. So in this case, the two insertion commands are largely redundant.

Experienced Vimmers describe there being no barrier between thought and editing. I have no doubt it's possible to internalise the two different cursor models in normal and insert mode, and reach a level of fluency where this small cognitive burden vanishes. But that's not an argument in favour of it. [8: Backwards compatibility is a good reason to stick to the standard Vim model. I don't advocate fixing Vim's cursor model unless, like me, you don't care about being able to use vanilla Vi(m).] Doing away with the less natural normal-mode cursor model, and making both normal mode and insert mode use the cursor-between-characters model, consolidates on a single, more natural model.

There's a small bonus. Replacing Vim's normal-mode cursor-on-characters model with the standard cursor-between-characters model frees up the precious, single-character, normal-mode keys a and P for other things. Single-character keys are like gold dust in Vim normal mode, especially unshifted ones. (On the other hand, the standard Vim xp combination for swapping two characters no longer works in the cursor-between-characters model, so you may need to use up a key on that.)
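As for the lost xp idiom, one possible replacement is Emacs' own transpose-chars command, which already works in the cursor-between-characters model: it swaps the characters on either side of point. The sketch below is only a suggestion. The choice of the "P" key is an assumption for illustration (any freed-up key would do), and my-transpose-demo is a hypothetical helper that exists only to show the effect.

```elisp
;; `transpose-chars' is a standard Emacs command: it swaps the characters
;; on either side of point, then leaves point after the swapped pair.
(defun my-transpose-demo ()
  (with-temp-buffer
    (insert "abcd")
    (goto-char 3)                     ; between "b" and "c"
    (transpose-chars 1)
    (list (buffer-string) (point))))

(my-transpose-demo)                   ; ("acbd" 4)

;; Assumption: "P" is the key you decide to spend on this.  Deferring the
;; binding until Evil is loaded keeps this safe to evaluate early.
(with-eval-after-load 'evil
  (evil-define-key 'normal 'global "P" #'transpose-chars))
```

Note this isn't an exact clone of xp: Vim swaps the character under the cursor with the next one, whereas transpose-chars swaps the characters before and after point.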

TL;DR: It's possible to replace evil-mode's cursor model with the cursor-between-characters model, with a modicum of Elisp. The code is collected here: evil-cursor-model.el.

Read on if you're interested in how this is done (or skip to the conclusion if not)…

Hacking the Evil cursor model

evil-mode aims to be a fully faithful implementation of Vim, so it goes to great lengths throughout the evil-mode code base to implement Vim's normal-mode cursor-on-characters model on top of Emacs' native cursor-between-characters model. Changing something so fundamental to evil-mode can't be done by just tweaking some configuration variables; it requires some Elisp. But I was pleasantly surprised to find it can be done without rewriting large parts of evil-mode. Redefining/advising a handful of key evil-mode functions is all it takes to completely replace the cursor model. A testament to the power and flexibility of having a Vim implementation written entirely in Elisp!

First off, there are a few evil-mode configuration variables that need setting to be consistent with the cursor-between-characters model:

(setq evil-move-cursor-back nil)
(setq evil-move-beyond-eol t)
(setq evil-highlight-closing-paren-at-point-states nil)

The first disables moving the cursor back one step when exiting insert mode, which doesn't make sense in the cursor-between-characters model. The second allows the cursor to be positioned after the final character on a line, which makes complete sense in the cursor-between-characters model. The third makes paren highlighting behave as in vanilla Emacs, which is the behaviour we want since Emacs also uses the cursor-between-characters model.

Unsurprisingly, the main changes concern how motion commands position the cursor, and where certain other commands act in relation to the cursor location. Emacs uses the cursor-between-characters model, so point always indicates a buffer location between two characters. evil-mode emulates Vim's cursor-at-character model on top of Emacs, by treating the cursor as being on top of the character after point. Therefore, Vim normal-mode commands that act before the cursor, like i, will generally do what we want. Commands that act after the cursor, like a, will either be redundant (and so freed up to be used for something else) or will need rewriting.

Some normal-mode commands can be switched over to the new cursor model simply by rebinding them to the existing normal-mode commands that act on the character before the cursor. For example, in the cursor-between-characters model, we always want to paste text at point (i.e. we want to paste the new text in between the two characters that point is between). In the cursor-at-character model, this is the location before the cursor. So we can get the behaviour we want by rebinding p to evil-paste-before. The P command becomes redundant, and is freed up to be used for something else:

(evil-define-key 'normal 'global "p" #'evil-paste-before)

t and f

More interesting is Vim's t motion command (and its backwards variant T). t moves forwards to the next occurrence of a character, leaving the cursor located on top of the character before the one being searched for. In the cursor-between-characters model, should we position the cursor/point before or after the character the cursor lands on in Vim? A little thought makes it clear we should position it after that character, i.e. immediately before the character being searched for. This is the natural meaning of moving to a character in the cursor-between-characters model. But more importantly, it's more useful.

The most common uses of t in Vim are to search for a character, and then insert text just before it with a. [9: An example of why having both i and a commands in Vim is not as useful as you might think. It's rare that you need to insert text before the character before the one you searched for, which is what ti allows. This is the only thing we've lost here in the cursor-between-characters model. Or rather, not lost, but requires one additional keypress: thi.] Or, in combination with d or c, to delete/change the text up to but not including the character. In the cursor-between-characters model, both of these uses are achieved by locating the cursor/point just before the character you searched for.

Since evil-mode interprets point before a character as the cursor being on top of that character, this is in fact exactly what the f command does in evil-mode. So we get the behaviour we want by rebinding t to evil-find-char (the function that f is bound to by default).

Similarly, T moves backwards to the previous occurrence of a character, which in the cursor-between-characters model means positioning the cursor/point just after that character. But this is exactly what the T command already does in evil-mode, so we don't need to rebind it. We rebind it here anyway to the function evil-find-char-to-backward it's usually bound to, just for completeness (and in case something else has rebound it from the evil-mode default).

(evil-define-key 'motion 'global "t" #'evil-find-char)
(evil-define-key 'motion 'global "T" #'evil-find-char-to-backward)

Vim's f command moves forwards to the next occurrence of a character, leaving the cursor located on top of it. The most common use of this is to insert text either immediately before or immediately after the character you just searched for, using i or a. Or to delete text up to and including that character. In the cursor-between-characters model, inserting before is already covered by ti. The other two common cases are enabled by locating the cursor/point immediately after the character being searched for.

There are no existing evil-mode commands that achieve this, so we have to write our own motion commands this time. However, all we need to do is to search for the character exactly as in evil-mode, but then move point one more character forwards (or backwards, if we're searching backwards).

There's one edge-case, though: if point is already at the character we're searching for, we want to search for the next occurrence. Otherwise we don't move the cursor at all, which isn't very useful (and isn't how Vim behaves). We account for this special case by first moving point one character forwards/backwards before searching. Updated 19 March 2021: The original version didn't set evil-last-find, so evil-repeat-find-char didn't work. The new version below fixes this.

(evil-define-motion evil-find-char-after (count char)
  "Move point immediately after the next COUNT'th occurrence of CHAR.
Movement is restricted to the current line unless `evil-cross-lines' is non-nil."
  :type inclusive
  (interactive "<c><C>")
  (unless count (setq count 1))
  (if (< count 0)
      (evil-find-char-backward (- count) char)
    (when (= (char-after) char)
      (forward-char)
      (cl-decf count))
    (evil-find-char count char)
    (forward-char))
  (setq evil-last-find (list #'evil-find-char-after char (> count 0))))

Similarly, Vim's F command moves backwards to the character, leaving the cursor on top of that character. In the cursor-between-characters model, this means locating the cursor/point before the character. But once again, this is exactly what the F command already does.

Now we've defined the necessary motion commands, we can rebind f. (We also rebind F to what it's usually bound to, for completeness.)

(evil-define-key 'motion 'global "f" #'evil-find-char-after)
(evil-define-key 'motion 'global "F" #'evil-find-char-backward)
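To summarise where point ends up for the forward variants, here's a stripped-down sketch using only built-in Emacs functions. It is not the evil-mode implementation: it ignores counts, the current-line restriction and the edge case discussed above, and my-find-char-demo is a hypothetical helper.

```elisp
;; Hypothetical helper: search forward for CHAR from the start of TEXT and
;; return the resulting point location.  With AFTER non-nil, point ends up
;; just after CHAR (the new `f'); otherwise just before it (the new `t').
(defun my-find-char-demo (text char after)
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (search-forward (char-to-string char))  ; leaves point after CHAR
    (unless after (backward-char))
    (point)))

(my-find-char-demo "foo(bar)" ?\( nil)  ; 4: between "o" and "("
(my-find-char-demo "foo(bar)" ?\( t)    ; 5: between "(" and "b"
```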

Word-wise (and WORD-wise) motion commands

The other group of motion commands we need to redefine are those that move by words or WORDS. [10: WORDS in Vim are any whitespace-delimited sequence of characters. Standard words are delimited by any non-word-constituent character.] (Motion by sentences already puts the cursor on top of the first character of the sentence, i.e. it puts point immediately before the first character of the sentence. Motion by paragraphs moves linewise, which isn't affected by the cursor model.)

We first define two new helper functions, analogous to evil-mode's evil-forward-end and evil-backward-end, but moving point immediately after/before the final/first character of whatever thing we're moving by, rather than one character before/after it. Since those functions are implemented using Emacs' built-in thing-at-point features which assume the cursor-between-characters model, this actually simplifies the implementation a little compared to the evil-mode versions. (We define new functions, rather than redefining or advising the original ones, in case any other commands call the evil-mode versions and expect the original behaviour.)

(defun evil-forward-after-end (thing &optional count)
  "Move forward to end of THING.
The motion is repeated COUNT times."
  (setq count (or count 1))
  (cond
   ((> count 0)
    (forward-thing thing count))
   (t
    (unless (bobp) (forward-char -1))
    (let ((bnd (bounds-of-thing-at-point thing))
          rest)
      (when bnd
        (cond
         ((< (point) (cdr bnd)) (goto-char (car bnd)))
         ((= (point) (cdr bnd)) (cl-incf count))))
      (condition-case nil
          (when (zerop 
                 (setq rest 
                       (forward-thing thing count)))
            (end-of-thing thing))
        (error))
      rest))))

(defun evil-backward-after-end (thing &optional count)
  "Move backward to end of THING.
The motion is repeated COUNT times. This is the same as
calling `evil-forward-after-end' with -COUNT."
  (evil-forward-after-end thing (- (or count 1))))
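The reason the forward case above can be so short is that Emacs' thing-at-point machinery already leaves point immediately after the end of the thing. A minimal sketch, using the built-in word thing rather than evil-mode's evil-word (so it runs without Evil loaded); my-word-end-demo is a hypothetical helper:

```elisp
(require 'thingatpt)

;; Hypothetical helper: move forward to the end of the first word in
;; "foo bar" and return the resulting point location.
(defun my-word-end-demo ()
  (with-temp-buffer
    (insert "foo bar")
    (goto-char (point-min))
    (forward-thing 'word 1)
    (point)))

(my-word-end-demo)   ; 4: immediately after the final "o" of "foo"
```

Vim's e would instead leave the cursor on top of that final "o", which in evil-mode corresponds to point 3.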

We define the four new motion commands (forward/backward by word/WORD) in terms of these helper functions:

(evil-define-motion evil-forward-after-word-end (count &optional bigword)
  "Move the cursor to the end of the COUNT-th next word.
If BIGWORD is non-nil, move by WORDS."
  :type inclusive
  (let ((thing (if bigword 'evil-WORD 'evil-word))
        (count (or count 1)))
    (evil-signal-at-bob-or-eob count)
    (evil-forward-after-end thing count)))

(evil-define-motion evil-backward-after-word-end (count &optional bigword)
  "Move the cursor to the end of the COUNT-th previous word.
If BIGWORD is non-nil, move by WORDS."
  :type inclusive
  (let ((thing (if bigword 'evil-WORD 'evil-word)))
    (evil-signal-at-bob-or-eob (- (or count 1)))
    (evil-backward-after-end thing count)))

(evil-define-motion evil-forward-after-WORD-end (count)
  "Move the cursor to the end of the COUNT-th next WORD."
  :type inclusive
  (evil-forward-after-word-end count t))

(evil-define-motion evil-backward-after-WORD-end (count)
  "Move the cursor to the end of the COUNT-th previous WORD."
  :type inclusive
  (evil-backward-after-word-end count t))

And now we can rebind the traditional Vim commands to our new cursor-between-characters versions:

(evil-define-key 'motion 'global "e"  #'evil-forward-after-word-end)
(evil-define-key 'motion 'global "E"  #'evil-forward-after-WORD-end)
(evil-define-key 'motion 'global "ge" #'evil-backward-after-word-end)
(evil-define-key 'motion 'global "gE" #'evil-backward-after-WORD-end)

Vim has many other types of motion commands. But only motion by characters is directly affected by the change to the cursor model. Commands for motion by whole lines, or even larger textual units, don't need rebinding.

Motion and Text objects

Vim's text objects are one of its great strengths. They allow normal-mode commands to act on a textual element the cursor is currently located within, such as the current word, sentence, paragraph, parenthesised expression, HTML tag, etc.

evil-mode implements motions and text objects with the help of "types": collections of functions that manipulate pairs of cursor positions in various ways, to return an appropriately expanded pair of buffer locations. (And also to translate them to the native Emacs model, to point locations.) For example, the line type expands the buffer locations until they span complete lines.

Evil isn't called the "Extensible Vi Layer" for nothing! The fact that evil-mode is implemented in terms of an extensible underlying framework makes it much easier to hack on. In this case, it turns out we can make all text objects and motions work correctly with the cursor-between-characters model by redefining a single "type".

Like line-wise motion commands, the line-oriented types don't need any modification. But the character-oriented "types" do. The standard character-oriented "types" are exclusive and inclusive.

In the cursor-at-characters model, exclusive returns a pair of buffer locations excluding the character that the second cursor location is on top of. Since evil-mode considers the cursor to be on top of a character if point is immediately before that character, this means exclusive usually returns the locations it's passed unchanged (with some exceptions around beginnings and ends of lines). In the cursor-between-characters model, when we want to specify a range of text up to but not including a character, we simply specify the end of the range to be the buffer location immediately before that character. This happens to match what exclusive returns anyway. So we can leave the exclusive type unchanged.

The inclusive type returns a pair of buffer locations including the character that the second cursor location is on top of. But in the cursor-between-characters model, when we want to specify a range of text up to and including a character, we simply specify the end of the range to be the buffer location immediately after that character. I.e. in the cursor-between-characters model, there's no distinction between inclusive and exclusive ranges of characters: ranges always unambiguously include precisely the characters located in between the two cursor locations. Therefore, we need to redefine evil-mode's inclusive type to be identical to the exclusive type:

(evil-define-type inclusive
  "Return the positions unchanged, with some exceptions.
If the end position is at the beginning of a line, and the
beginning position is at or before the first non-blank
character on the line, return `line' (expanded)."
  :expand (lambda (beg end) (evil-range beg end))
  :contract (lambda (beg end) (evil-range beg end))
  :normalize (lambda (beg end)
               (cond
                ((progn
                   (goto-char end)
                   (and (/= beg end) (bolp)))
                 (setq end (max beg (1- end)))
                 (cond
                  ((progn
                     (goto-char beg)
                     (looking-back
                      "^[ \f\t\v]*"
                      (line-beginning-position)))
                   (evil-expand beg end 'line))
                  (t
                   (unless evil-cross-lines
                     (setq end (max beg (1- end))))
                   (evil-expand beg end 'inclusive))))
                (t
                 (evil-range beg end))))
  :string (lambda (beg end)
            (let ((width (- end beg)))
              (format "%s character%s" width
                      (if (= width 1) "" "s")))))

With this one redefinition, all evil-mode operators and text objects behave correctly in the cursor-between-characters model!
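To make the inclusive/exclusive distinction concrete, here's a minimal sketch in plain Emacs Lisp, with no evil-mode involved. The helper my-range-demo is hypothetical; its INCLUSIVE flag only mimics, roughly, how a cursor-on-characters range pulls in the character that the end cursor sits on.

```elisp
;; Hypothetical helper: return the text of "abcdef" between locations BEG
;; and END.  With INCLUSIVE non-nil, also include the character that a
;; cursor-on-characters cursor at END would be sitting on.
(defun my-range-demo (beg end inclusive)
  (with-temp-buffer
    (insert "abcdef")
    (buffer-substring beg (if inclusive (1+ end) end))))

(my-range-demo 2 4 nil)   ; "bc"
(my-range-demo 2 4 t)     ; "bcd"
(my-range-demo 2 5 nil)   ; "bcd"
```

The last two calls return the same text: in the cursor-between-characters model you just ask for the range 2 to 5 directly, which is why a single range type suffices.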

Search offsets

This is a subtle one. In Vim's regexp searches, invoked with / or ?, you can put additional switches after the regexp (separated from the regexp by a "/") to modify the behaviour of the search. The "e" switch puts the cursor at the end of the matching text, rather than at the beginning. This is a minor efficiency saving if you're just using / to move the cursor. But it becomes highly useful when you're using / with an operator, e.g. d or c to delete or change a range of text. It lets you specify that you want to delete or change the text up-to-and-including the matching text, instead of up to but not including the match.

In Vim's cursor-at-characters model, this means putting the cursor on top of the final character of the matching text, corresponding to point immediately before that character. In the cursor-between-characters model, this means putting the cursor/point immediately after that character, i.e. one character later.

The cursor location adjustment is performed by the evil-ex-search-goto-offset function. To leave point in the correct location for the cursor-between-characters model, we add :after advice to this function so that, when the "e" switch is supplied, it moves point one character forward.

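(defun ad-evil-ex-search-adjust-offset (offset)
  "Move point one character forward if OFFSET uses the \"e\" switch."
  (unless (zerop (length offset))
    (save-match-data
      (when (and (string-match
                  "^\\([esb]\\)?\\(\\([-+]\\)?\\([0-9]*\\)\\)$"
                  offset)
                 ;; Group 1 is optional, so check that it matched
                 ;; before indexing into OFFSET.
                 (match-beginning 1)
                 (= (aref offset (match-beginning 1)) ?e)
                 ;; We're about to move forwards, so guard against
                 ;; the end of the buffer.
                 (not (eobp)))
        (forward-char 1)))))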

(advice-add 'evil-ex-search-goto-offset :after #'ad-evil-ex-search-adjust-offset)
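The one-character difference between the two models is visible in Emacs' own match data. Here's a small sketch using only built-in search functions, not evil-mode's ex search; my-search-end-demo is a hypothetical helper.

```elisp
;; Hypothetical helper: search for "wor" in "hello world" and return two
;; candidate "end of match" locations as a list:
;;   - point immediately before the final matched character, i.e. the
;;     cursor-on-characters reading (cursor on top of "r");
;;   - point immediately after it, i.e. the cursor-between-characters
;;     reading.
(defun my-search-end-demo ()
  (with-temp-buffer
    (insert "hello world")
    (goto-char (point-min))
    (re-search-forward "wor")
    (list (1- (match-end 0)) (match-end 0))))

(my-search-end-demo)   ; (9 10)
```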

Similarly, Vim's % command jumps between matching delimiters. In the cursor-at-characters model, this means moving the cursor on top of the matching delimiter, i.e. moving point immediately before the delimiter. In the cursor-between-characters model, this is fine when jumping backwards. But when jumping forwards, we want to move the cursor immediately after the matching delimiter, i.e. one character further forwards.

There's an ambiguous corner case to deal with, when the cursor is in between a closing delimiter and an opening delimiter. [11: This is a rare case where the cursor-at-characters model simplifies things: there's no ambiguity about which delimiter the cursor is at, when the cursor is considered to be on top of the delimiter.] We handle this case by following Emacs' lead: we jump to the delimiter that Emacs' show-paren-mode highlights in the same situation.

Updated 28 February 2021: The original version of this led to an infinite recursion if evil-jump-item called itself recursively, which it occasionally does. The new version below fixes this.

We change the behaviour of % by defining a new command which adjusts the point location before and after calling evil-jump-item to make it do the right thing in the cursor-between-characters model, and binding it to %.

(evil-define-motion evil-jump-item-before (count)
  "Find the next item in this line immediately before
or somewhere after the cursor and jump to the corresponding one."
  :jump t
  :type inclusive
  (let ((pos (point)))
    (unless (or (bolp) (bobp)) (backward-char))
    (condition-case nil
        (evil-jump-item count)
      (user-error (goto-char pos)))
    (unless (< (point) pos)
      (goto-char pos)
      (evil-jump-item count)
      (when (> (point) pos) (forward-char)))))

(evil-define-key 'motion 'global "%" #'evil-jump-item-before)
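The target locations for a jump in the cursor-between-characters model are exactly what Emacs' built-in list scanning returns. The sketch below is illustrative only: it uses scan-lists rather than evil-jump-item, borrows the Emacs Lisp syntax table so the parentheses are treated as delimiters, and my-paren-jump-demo is a hypothetical helper.

```elisp
;; Hypothetical helper: in the text "(a (b) c)", return a list of
;;   - where a forward jump from just before the first "(" lands, and
;;   - where a backward jump from just after the last ")" lands.
(defun my-paren-jump-demo ()
  (with-temp-buffer
    (insert "(a (b) c)")
    (with-syntax-table emacs-lisp-mode-syntax-table
      (list (scan-lists (point-min) 1 0)
            (scan-lists (point-max) -1 0)))))

(my-paren-jump-demo)   ; (10 1)
```

Jumping forwards lands immediately after the closing parenthesis (location 10), whereas the cursor-at-characters model would stop at location 9, on top of it. Jumping backwards lands at location 1 in both models.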

Roundup

All the above code is collected here, for convenience: evil-cursor-model.el. This isn't a well-crafted package as yet, just a bunch of Elisp. If you paste it into your .emacs, you need to be sure to put it after you load the Evil package for it to work.

I've been using this modified cursor model in evil-mode for many months now, and it seems to work perfectly. But there could be some Vim or evil-mode commands I don't use yet, which still need tweaking to work correctly. If so, let me know in the comments, below.

The fact that such a fundamental aspect of an editor – namely, the way its cursor model works – can be comprehensively changed with just a few lines of code is thanks to the fact that Emacs is almost entirely written in a Lisp, Lisp systems are so easy to modify and update on the fly, and the Evil package was designed as an extensible framework for constructing a Vim emulation within Emacs.

Try doing that in Vimscript [12: I wouldn't be at all surprised if you can do it in Vimscript. Never underestimate the ingenuity of hackers.] :)
