Git search and replace
Here is a technique for doing a search-and-replace inside a git repository. I find myself doing this a few times a year, and it's rare enough that I always forget the exact commands. So I thought I'd write it down here for posterity.
TL;DR
git grep -l SEARCH | xargs sed -i 's/SEARCH/REPLACE/g'
If you want more details, keep reading.
The problem
Say you're doing a refactor in your code, or some major overhaul of your documentation, and you find yourself wanting to do a search and replace across all files in your repository.
Let's say for example that you've been hard at work developing a cool new web framework called
jingo (it stands for "javascript inspired network graph object" in case you were wondering). One
day you realize with horror that the name you've chosen is awfully similar to another popular framework. So
just to be safe, you decide to rename your project and change its name to flosk ("forward logic
object software kit"). Now it's time to update all your code and documentation!
My solution
I imagine that today's fancy text editors have built-in ways to do these kinds of refactors, but personally I don't use a fancy text editor and I think it's useful to know how to do this "by hand".
My solution is to break the problem into two parts. First, find all the occurrences of the text I want to
remove (jingo in my example from earlier), then replace that text with the new one
(flosk).
Search with git grep ...
The classic tool for finding text in a series of files is grep, but in this case I will use
git grep because it has some useful defaults when dealing with a git repository. In theory all
the examples should work using plain grep but might require a bit of extra work.
git grep jingo
This will find all occurrences of jingo in any tracked file inside the git repository, and show (by
default) the name of the file and the line that contains the text.
In my case I don't want to see the match, just the list of files. That's what the
--files-with-matches flag is for (-l for short):
git grep -l jingo
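To see the difference between the two forms, here is a small sketch that builds a throwaway repository and runs both. The directory and file names are made up for the demo, and it assumes git is installed. Note that git grep only looks at tracked files, which is why the demo runs git add first.

```shell
# Build a throwaway repository so the commands can be tried safely.
# The file names and contents are invented for this demo.
demo=$(mktemp -d)
cd "$demo"
git init -q
printf 'Welcome to jingo\njingo is great\n' > README.txt
printf 'nothing to see here\n' > LICENSE.txt
git add .

# One line of output per match, prefixed with the file name.
git grep jingo

# One line of output per matching file, however many matches it contains.
files=$(git grep -l jingo)
echo "$files"
```

README.txt contains two matches but is listed only once by the second command, which is exactly what we want to hand over to the next step.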
Replace with sed
The classic tool for doing search-and-replace inside a file is the unix utility sed. It's a
powerful tool but it can also be a little tricky to use. It has its own mini language to write commands that
modify the file you give it in different ways. It calls those commands "expressions", and the first letter of
the expression tells you what kind of operation is done. In our case we will use the s command
which stands for "substitute". I have also used the d command in the past for deleting specific
lines.
sed 's/jingo/flosk/g' README.txt
By default, sed will not modify any files you give it. Instead it applies the command you gave it
to the file, and shows the modified version leaving the original file intact. For our use case we actually
want to modify the file, so we need to give sed the --in-place option (-i
for short):
sed -i 's/jingo/flosk/g' README.txt
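Here is a quick way to convince yourself of the difference, sketched on a scratch file (the file name and contents are made up). It assumes GNU sed, as found on most Linux systems. On BSD and macOS, sed expects a backup suffix after -i, so the equivalent there would be sed -i '' 's/jingo/flosk/g' README.txt.

```shell
# Scratch file for the demo; the name and contents are invented.
demo=$(mktemp -d)
cd "$demo"
printf 'Welcome to jingo\njingo is great\n' > README.txt

# Without -i: the result goes to standard output and the file is untouched.
preview=$(sed 's/jingo/flosk/g' README.txt)
before=$(cat README.txt)

# With -i (GNU sed syntax): the file itself is rewritten.
sed -i 's/jingo/flosk/g' README.txt
after=$(cat README.txt)

echo "before: $before"
echo "after:  $after"
```

Running the command without -i first is a cheap way to preview a substitution before committing to it.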
Putting it together with xargs
We're almost there! With git grep ... we can find the files, and with sed ... we can
search/replace inside one file. All we need is to "glue" the two together and run the sed command
on each file. One way I like to do it is using the xargs command:
git grep -l jingo | xargs sed -i 's/jingo/flosk/g'
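Since the files live in a git repository, git itself can be used to review the result. The sketch below runs the pipeline in a throwaway repository (made-up file names, GNU sed assumed) and then checks what changed.

```shell
# Throwaway repository; file names and contents are invented for the demo.
demo=$(mktemp -d)
cd "$demo"
git init -q
printf 'Welcome to jingo\n' > README.txt
printf 'import jingo\n' > app.py
printf 'unrelated\n' > notes.txt
git add .

# Rewrite every tracked file that mentions jingo (GNU sed syntax for -i).
git grep -l jingo | xargs sed -i 's/jingo/flosk/g'

# Review what changed relative to the staged copies.
git diff --stat
changed=$(git diff --name-only)

# git grep exits with a non-zero status when nothing matches,
# so this checks that the old name is gone from all tracked files.
if git grep -q jingo; then leftover=yes; else leftover=no; fi
echo "leftover matches: $leftover"
```

If the diff does not look right, git checkout -- . discards the unstaged changes and puts the files back the way they were.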
Going a bit further
Dealing with files containing spaces in their names
Because of the way xargs works, the command I've shown will break if any of the matched files
has a space in its name. For example if the file is called READ ME.txt then
xargs will tell sed to open two files: READ and ME.txt.
One solution for this is to tell git grep to use a special character to separate the file names
(instead of putting one file name per line), and to tell xargs to use that same character for
splitting the list of file names back into individual ones before passing them to sed. The
character used is called a "null byte" and is normally not allowed in file names.
The option is called --null for both git grep and xargs, but
unfortunately its short version is not the same with both tools: git grep uses
-z while xargs uses -0 (a zero).
Here's what our example looks like now:
git grep -lz jingo | xargs -0 sed -i 's/jingo/flosk/g'
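The failure and the fix can both be reproduced in a throwaway repository. This is only a sketch: the file name comes from the example above, the contents are made up, and GNU sed and xargs are assumed.

```shell
# Throwaway repository with a single file whose name contains a space.
demo=$(mktemp -d)
cd "$demo"
git init -q
printf 'Welcome to jingo\n' > 'READ ME.txt'
git add .

# Naive version: xargs splits on the space, so sed is asked to open
# "READ" and "ME.txt", neither of which exists. The real file is left as is.
# The "|| true" only keeps the demo going after the expected failure.
git grep -l jingo | xargs sed -i 's/jingo/flosk/g' || true
naive=$(cat 'READ ME.txt')

# Null-separated version: the file name reaches sed in one piece.
git grep -lz jingo | xargs -0 sed -i 's/jingo/flosk/g'
fixed=$(cat 'READ ME.txt')

echo "after naive version: $naive"
echo "after -z / -0:       $fixed"
```

The first pipeline prints errors about missing files and changes nothing, while the second one rewrites the file as intended.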
Restricting the target files
Because we use git grep, we can use its full power when it comes to targeting specific files
(what git calls "pathspecs"). For example let's say we only want to target Python files:
git grep -l jingo -- '*.py' | xargs sed -i 's/jingo/flosk/g'
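As a small sketch of the effect, the throwaway repository below has one Python file and one text file that both mention jingo (all names are made up, GNU sed assumed). Only the file matching the pathspec is rewritten.

```shell
# Throwaway repository; file names and contents are invented for the demo.
demo=$(mktemp -d)
cd "$demo"
git init -q
mkdir docs
printf 'import jingo\n' > app.py
printf 'Welcome to jingo\n' > docs/intro.txt
git add .

# Only files matching the pathspec are searched, so only app.py is rewritten.
git grep -l jingo -- '*.py' | xargs sed -i 's/jingo/flosk/g'

py=$(cat app.py)
txt=$(cat docs/intro.txt)
echo "app.py:         $py"
echo "docs/intro.txt: $txt"
```

The quotes around '*.py' matter: they stop the shell from expanding the pattern itself, so that git gets to interpret it.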
More complex search/replace with regular expressions
So far we've searched for and replaced plain words, but sometimes you need a bit of extra power.
Luckily both git grep and sed support extended regular expressions: git grep has --extended-regexp while
sed has --regexp-extended. It's a bit annoying that they use different names, but luckily the
short name of the flag is the same for both: -E.
As an example, here is how you can replace all instances of both jingo and
JINGO with flosk:
git grep -l -E '(jingo|JINGO)' | xargs sed -i -E 's/(jingo|JINGO)/flosk/g'
To use capture groups in the replacement sed command, we can use \1 (and \2, ... if
there is more than one capture group):
git grep -l -E '(jingo|JINGO)' | xargs sed -i -E 's/(jingo|JINGO)/the framework formerly known as \1/g'
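To check what these two expressions do before letting them loose on a repository, they can be tried on a scratch file without -i. This is a minimal sketch with a made-up file; it only exercises the sed half of the pipeline.

```shell
# Scratch file containing both spellings; name and contents are invented.
demo=$(mktemp -d)
cd "$demo"
printf 'jingo and JINGO\n' > README.txt

# Alternation: both spellings are replaced by the same text.
renamed=$(sed -E 's/(jingo|JINGO)/flosk/g' README.txt)

# Capture group: \1 stands for whatever the parentheses matched.
kept=$(sed -E 's/(jingo|JINGO)/the framework formerly known as \1/g' README.txt)

echo "$renamed"
echo "$kept"
```

The second command keeps the original spelling of each match, because \1 is filled in separately for every occurrence.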
I hope that helps, happy refactoring!