Saturday, February 14, 2009

Mysql 5.1 and SphinxSearch on Ubuntu 8.10

Sphinx is a binary flat-file search index daemon which uses a MySQL (and now Postgres) plugin so that it can behave much like a single table. Because the Sphinx query engine is built for speed, it can return queries in less than a second off of billions of rows of text data, a thousand or more times faster than a MySQL FULLTEXT index.

Here's my runthrough of the HOWTO on my Ubuntu 8.10 dev host.

Prepare your install
Make sure the following packages are installed (this list is for 5.1.x):
apt-get -y install g++ libstdc++6 libncurses5-dev libtool automake1.9 libexpat1-dev

Fetch the Source.
Get the Sphinx source. The latest release today was 0.9.8.1.
Get the MySQL source. This post deals with MySQL 5.1.32, partly because it is the latest and greatest, but mostly because 5.0.75 gives install errors and won't build when Sphinx is added.

Build MySQL.
MySQL has to be built first, but you need both the MySQL and the Sphinx source on hand before you start, because the Sphinx storage engine gets compiled into MySQL.

1) copy the files from the mysqlse directory of the Sphinx source to a new directory storage/sphinx in the MySQL source tree (adjust the Sphinx directory name to the version you downloaded):

mkdir storage/sphinx
cp ../sphinx-0.9.8.1/mysqlse/* storage/sphinx

2) run BUILD/autorun.sh from the main mysql directory to incorporate the changes into the configure and make utilities:

./BUILD/autorun.sh
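
The copy step above can be wrapped in a small helper that refuses to run when either source tree is not where you think it is. This is only a sketch: stage_engine is a made-up name, and both directories are passed in as arguments rather than assumed.

```shell
# Hypothetical helper (not part of MySQL or Sphinx): copy the storage
# engine sources into a MySQL 5.1 source tree.
#   $1 = directory holding the engine sources inside the Sphinx tree
#   $2 = top of the MySQL source tree
stage_engine() {
    engine_src=$1
    mysql_src=$2
    # Stop early if either tree is missing.
    [ -d "$engine_src" ] || { echo "no engine sources in $engine_src" >&2; return 1; }
    [ -d "$mysql_src/storage" ] || { echo "$mysql_src has no storage directory" >&2; return 1; }
    # Create storage/sphinx if needed, then copy the directory contents.
    mkdir -p "$mysql_src/storage/sphinx" &&
    cp -R "$engine_src"/. "$mysql_src/storage/sphinx/"
}
```

Call it from the MySQL source directory with the engine directory of your Sphinx tree as the first argument and . as the second. It has to run before BUILD/autorun.sh, since autorun.sh is what regenerates configure with the new plugin in it.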

3) configure the build
There are lots of options here, which you can list by running ./configure --help, or see the configuration documentation.

for 5.1.x:
./configure --with-plugins=sphinx --enable-assembler \
--with-extra-charsets=complex --enable-thread-safe-client \
--with-readline --with-big-tables --with-mysqld-ldflags=-all-static \
--enable-local-infile --with-charset=utf8 --disable-shared \
--prefix=/usr/local/mysql --localstatedir=/data \
--with-mysqld-user=mysql --exec-prefix=/usr/local/mysql CXX=gcc

for 5.0.x:
./configure --with-sphinx-storage-engine --enable-assembler \
--with-extra-charsets=complex --enable-thread-safe-client \
--with-readline --with-big-tables --with-mysqld-ldflags=-all-static \
--enable-local-infile --with-charset=utf8 --disable-shared \
--prefix=/usr/local/mysql --localstatedir=/data \
--with-mysqld-user=mysql --exec-prefix=/usr/local/mysql CXX=gcc

Most of these flags are recommended in the Sphinx docs. I added the last four (--prefix, --localstatedir, --with-mysqld-user and --exec-prefix), mostly explicit paths for data and installation, because the build has problems setting itself up properly on Linux without them.
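
The only flag that differs between the two configure lines is the one that switches the engine on. A minimal sketch of picking it from the version string, where sphinx_engine_flag is a made-up helper name and not something MySQL or Sphinx ships:

```shell
# Hypothetical helper: print the configure flag that enables the Sphinx
# storage engine for a given MySQL version string (e.g. 5.1.32).
sphinx_engine_flag() {
    case $1 in
        5.1.*) echo "--with-plugins=sphinx" ;;
        5.0.*) echo "--with-sphinx-storage-engine" ;;
        *)     echo "unsupported MySQL version: $1" >&2; return 1 ;;
    esac
}
```

It would be used as ./configure "$(sphinx_engine_flag 5.1.32)" followed by the remaining flags from the lines above, which are the same for both versions.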

4) compile and install:
make
sudo make install


5) set up mysql
See the MySQL instructions for installing 5.1.x from a source build.

5.1) create the mysql configuration file

cp [mysqlsrcdir]/support-files/my-medium.cnf /etc/mysql/my.cnf
chmod 444 /etc/mysql/my.cnf


This file will cause a conflict if more than one MySQL version is installed at once, because .cnf files are not compatible between versions.

The conf file may have to be edited, especially if this is a second MySQL server.

cd /usr/local/
sudo su
mkdir /data # this is the mysql data directory
chown mysql:mysql /data
chown -R mysql:mysql mysql
cd mysql # the install prefix; bin/ and mysql-test/ live here
./bin/mysql_install_db --user=mysql --basedir=/usr/local/mysql --datadir=/data
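
The chown commands above only work if a mysql account already exists on the host. A small pre-flight check, sketched with made-up helper names (account_exists and check_mysql_account are not part of any package), that reports the situation without changing anything:

```shell
# Hypothetical helper: succeed if a local account with this name exists.
account_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Report whether the account needed by the chown steps is present.
# Defaults to "mysql" when no name is given.
check_mysql_account() {
    name=${1:-mysql}
    if account_exists "$name"; then
        echo "account $name exists"
    else
        echo "account $name is missing; create it before running chown"
        return 1
    fi
}
```

If the account is missing, the MySQL source install instructions create it with groupadd mysql followed by useradd -g mysql mysql.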


# test the install - can be informative - all tests should pass...

cd mysql-test
perl mysql-test-run.pl


# run the install
/usr/local/mysql/bin/mysqld_safe &
/usr/local/mysql/bin/mysql # add this bin to your path to use this install or type the path explicitly
mysql> show engines;


Known Bug: you may have to comment out this line in /etc/mysql/my.cnf if --skip-federated is not recognized by mysqld on startup:

#skip-federated

Also change the location of mysql.sock to the location that Sphinx expects (see below).

You should see this line in mysql's response to show engines:
+------------+---------+-------------------------------------------+
| Engine     | Support | Comment                                   |
...
| SPHINX     | YES     | Sphinx storage engine 0.9.8.1


Set up the user
mysql -uroot -p
mysql> create user 'ron'@'%';
mysql> grant all on *.* to 'ron'@'%';
mysql> set password for 'ron'@'%' = PASSWORD('mypass');

In order to make my build of mysql work with Sphinx I had to modify the my.cnf file so that the mysql.sock file is at /var/run/mysqld/mysqld.sock, where Sphinx expects it to be. Also, some defaults require that the bind-address be changed from 127.0.0.1 to your real IP address.
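
The two my.cnf edits described above (commenting out skip-federated and moving the socket) can be scripted. This is a sketch under a few assumptions: patch_my_cnf is a made-up name, GNU sed is available for -i, and the socket lines start at the beginning of the line, which I believe is how the my-medium.cnf template writes them. The bind-address is left for hand editing since the right value depends on the host.

```shell
# Hypothetical helper: apply the socket and skip-federated edits in place.
#   $1 = path to the my.cnf file
#   $2 = socket path (optional, defaults to the Ubuntu location)
patch_my_cnf() {
    cnf=$1
    sock=${2:-/var/run/mysqld/mysqld.sock}
    # Comment out skip-federated and rewrite every "socket = ..." line.
    sed -i \
        -e 's|^skip-federated|#skip-federated|' \
        -e "s|^socket[[:space:]]*=.*|socket = $sock|" \
        "$cnf"
}
```

Run it as root, for example patch_my_cnf /etc/mysql/my.cnf, and restart mysqld afterwards so the new socket path takes effect.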

Build Sphinx
Building sphinx works best when you specify where everything is:

./configure --prefix=/usr/local/sphinx --with-mysql \
--with-mysql-libs=/usr/local/mysql/lib/mysql \
--with-mysql-includes=/usr/local/mysql/include/mysql

I experimented with --with-libstemmer, but it requires you to download more code and add it to the source, so I didn't get very far.

make
sudo make install
sudo chown -R mysql:mysql /usr/local/sphinx


Since this is a /usr/local install, I add the mysql bin directory to my path (I put this in my .bashrc):


export PATH=${PATH}:/usr/local/mysql/bin
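
One thing to watch with the export above is that every re-source of .bashrc appends the directory again. A sketch of a guard against that, where path_append is a made-up helper name:

```shell
# Hypothetical helper: append a directory to PATH only if it is not
# already listed, so re-sourcing .bashrc does not keep growing PATH.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

path_append /usr/local/mysql/bin
export PATH
```

The same helper works for /usr/local/sphinx/bin if you want indexer and search on your path as well.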

Test Sphinx (sample):
>cd /usr/local/sphinx/etc
>cp sphinx.conf.dist sphinx.conf
>mysql < example.sql # load the sample documents that test1 indexes
>/usr/local/sphinx/bin/indexer test1

Sphinx 0.9.8.1-release (r1533)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file '/usr/local/sphinx/etc/sphinx.conf'...
indexing index 'test1'...
collected 4 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 4 docs, 193 bytes
total 0.010 sec, 19300.00 bytes/sec, 400.00 docs/sec


>/usr/local/sphinx/bin/search test

Sphinx 0.9.8.1-release (r1533)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file '/usr/local/sphinx/etc/sphinx.conf'...
index 'test1': query 'test ': returned 3 matches of 3 total in 0.000 sec

displaying matches:
1. document=1, weight=2, group_id=1, date_added=Tue Mar 10 22:47:23 2009
id=1
group_id=1
group_id2=5
date_added=2009-03-10 22:47:23
title=test one
content=this is my test document number one. also checking search within phrases.
2. document=2, weight=2, group_id=1, date_added=Tue Mar 10 22:47:23 2009
id=2
group_id=1
group_id2=6
date_added=2009-03-10 22:47:23
title=test two
content=this is my test document number two
3. document=4, weight=1, group_id=2, date_added=Tue Mar 10 22:47:23 2009
id=4
group_id=2
group_id2=8
date_added=2009-03-10 22:47:23
title=doc number four
content=this is to test groups

words:
1. 'test': 3 documents, 5 hits

index 'test1stemmed': search error: failed to open /usr/local/sphinx/var/data/test1stemmed.sph: No such file or directory.
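
The error on the last line is expected here: the config declares test1stemmed too, but only test1 was indexed, so its files were never written. Running indexer --all builds every index in the config. To see which ones are still unbuilt, something like the following sketch can help. list_indexes and missing_indexes are made-up names, and the check simply looks for a .sph file named after each index in one data directory, so distributed indexes (which have no local files) would show up as missing too.

```shell
# Hypothetical helper: list the index names declared in a sphinx.conf.
# Matches lines such as "index test1" or "index test1stemmed : test1".
# The "indexer" section does not match because whitespace must follow "index".
list_indexes() {
    sed -n 's/^[[:space:]]*index[[:space:]][[:space:]]*\([A-Za-z0-9_]*\).*/\1/p' "$1"
}

# Print the declared indexes that have no .sph header file in the data dir.
#   $1 = path to sphinx.conf, $2 = index data directory
missing_indexes() {
    conf=$1
    data_dir=$2
    for name in $(list_indexes "$conf"); do
        [ -f "$data_dir/$name.sph" ] || echo "$name"
    done
}
```

With this install the call would be missing_indexes /usr/local/sphinx/etc/sphinx.conf /usr/local/sphinx/var/data.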


Next: build an application index:
Now it's time to build my first massive index off of another table in the database. Posting next...
