Archipelago-deployment-live: Upgrading Solr
What is this documentation for?
This documentation will help you upgrade your Solr index from 8.x to 9.x, or between 9.x releases, and is meant to be a guide/helper. There is no simple way of saying this: because of the way a Solr index (sort of a binary tree) is built, any larger change in the schema or field type definitions requires at least a complete reindex and, most of the time, a wipe and a fresh start. There is no perfect way around it. There are very complex ways of keeping an old server running and serving searches while you reindex a new one, but honestly, the how and the approach will depend on your existing knowledge of Solr and your skills (even memory!) to execute them, and documenting those hacks is beyond the scope of this documentation. What is proven is what we explain in this document.
Requirements
- An archipelago-deployment-live instance (working, tested) deployed using provided instructions via Docker running either Solr 8.x or 9.x
- Good knowledge, patience and instincts (+ courage and time) on how to run Terminal Commands.
- Patience (again, but also patience from your users, since search will be unavailable until you reindex). You can't skip steps here.
- For shell commands documented here, please copy line by line, not the whole block.
- You are already using version control and know how to git pull/push/merge.
Backing up and preparing for the upgrade
Backups are always going to be your best friends. Archipelago's code, database, and settings are mostly self-contained in your current archipelago-deployment-live repo folder, and backing up is simple because of that.
Step 1:
To make upgrading simpler we will clone archipelago-deployment-live (a fresh, empty copy) into a different folder. That way we can copy complete folders of configs and files instead of fetching them from GitHub one by one.
Go to your home folder (for the sake of this documentation it will be /home/ec2-user, but you can also use the $HOME environment variable instead):
cd /home/ec2-user
git clone https://github.com/esmero/archipelago-deployment-live archipelago-deployment-live-1.4.0
cd archipelago-deployment-live-1.4.0
git switch 1.4.0
Now, on a terminal, cd into your running archipelago-deployment-live folder, then cd into the deploy/ec2-docker subfolder and shut down your docker-compose ensemble by running the following:
docker-compose down
Step 2:
Verify that all containers are actually down. The following command should return an empty listing:
docker ps
If anything is still running, wait a little longer and run the command again.
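If you would rather not re-run the command by hand, the wait can be scripted. The following is only a minimal sketch and not part of the tested procedure: wait_until_empty and POLL_SECONDS are hypothetical names introduced here, and the function simply polls whatever command you hand it until that command prints nothing.

```shell
# Hypothetical helper: poll a command until it prints nothing.
# $1 = maximum number of attempts; remaining arguments = the command to run.
# POLL_SECONDS (assumed name) is the pause between attempts, default 5 seconds.
# Note: a command that fails without printing anything also counts as "empty".
wait_until_empty() {
  max_attempts="$1"
  shift
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Empty output means nothing is listed any more.
    if [ -z "$("$@")" ]; then
      return 0
    fi
    sleep "${POLL_SECONDS:-5}"
    attempt=$((attempt + 1))
  done
  return 1
}

# On the server you could call it like this (docker ps -q prints only container IDs):
#   wait_until_empty 12 docker ps -q && echo "All containers are down"
```

If the function returns a non-zero status, something is still running after the last attempt and is worth inspecting with docker ps before you continue.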
Step 3:
Now let's tar.gz the whole ensemble with data and configs. We will exclude the local source caches generated by Cantaloupe. Whether these exist will depend on how customized your deployment is.
As an example we will save this into your $HOME folder. As a good practice we append the current date (YEAR-MONTH-DAY) to the filename. Here we assume today is July 7th of 2024.
We will cd back to the parent folder of your running archipelago-deployment-live folder, so three levels up, assuming you are right now inside archipelago-deployment-live/deploy/ec2-docker:
cd ../../..
sudo tar --exclude=archipelago-deployment-live/data_storage/iiifcache --exclude=archipelago-deployment-live/data_storage/iiiftmp -czvpf $HOME/archipelago-deployment-D10-20240707.tar.gz archipelago-deployment-live
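The date in the file name does not have to be typed by hand. As a small sketch (backup_path and BACKUP_FILE are hypothetical names used only for illustration), the same naming pattern can be built from the output of date:

```shell
# Hypothetical helper: build the backup file name used in this guide.
# $1 = target folder, $2 = date stamp in YEARMONTHDAY form (for example 20240707).
backup_path() {
  printf '%s/archipelago-deployment-D10-%s.tar.gz\n' "$1" "$2"
}

# Use today's date instead of a hard-coded one.
BACKUP_FILE="$(backup_path "$HOME" "$(date +%Y%m%d)")"
echo "$BACKUP_FILE"
```

You could then pass "$BACKUP_FILE" to the -f option of the tar command instead of the literal path.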
The process may take a few minutes. Now let's verify that all is there and that the tar.gz is not corrupt.
tar -tvvf $HOME/archipelago-deployment-D10-20240707.tar.gz
You will see a listing of files, and at the end you will see something like this: Archive Format: POSIX pax interchange format, Compression: gzip. If corrupt (Do you have enough space? Did your ssh connection drop?) you will see the following:
tar: Unrecognized archive format
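The exact messages differ between tar implementations, so a script may prefer to rely on the exit status rather than on reading the listing. A minimal sketch, assuming a tar that understands -z (archive_ok is a hypothetical name):

```shell
# Hypothetical helper: succeed only if the gzip-compressed tar can be listed to the end.
# The listing itself is discarded; only the exit status of tar matters here.
archive_ok() {
  tar -tzf "$1" >/dev/null 2>&1
}

# Example on the server:
#   archive_ok "$HOME/archipelago-deployment-D10-20240707.tar.gz" && echo "Backup is readable"
```

A readable listing only shows that the archive is structurally intact. It does not prove that every file you care about made it in, so a quick look at the verbose listing is still worthwhile.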
Step 4:
cd again into your running archipelago-deployment-live folder, then cd into the deploy/ec2-docker subfolder. Restart your docker-compose ensemble, and wait a little while for all to start.
docker-compose up -d
Step 5:
Export/backup all of your live Archipelago 1.3.0, Drupal 10 configurations (this allows you to compare/come back in case you lose something custom during the upgrade).
docker exec esmero-php mkdir config/backup
docker exec esmero-php drush cex --destination=/var/www/html/config/backup
Good. Now it's safe to begin the upgrade process.
Upgrading to Solr 9.2.1
Step 0: Get familiar with what changed.
Running a production server requires some informed decision making and thus, we believe, a good pre-step is reviewing what changed between releases. Specifically, focus on these folders:
https://github.com/esmero/archipelago-deployment-live/tree/1.4.0/config_storage/solrconfig/conf
and
https://github.com/esmero/archipelago-deployment-live/tree/1.4.0/data_storage/solrlib
Also (please) read the official documentation here https://solr.apache.org/guide/8_9/solr-upgrade-notes.html
Step 1: Edit docker-compose.yml
You want to replace your current Solr service (in its entirety; please make sure the indentation is 1:1).
  solr:
    container_name: esmero-solr
    restart: always
    image: "solr:9.2.1"
    # If running Docker < 20.10.10 please uncomment the following lines
    # See https://solr.apache.org/guide/solr/latest/upgrade-notes/major-changes-in-solr-9.html#solr-9-2
    #security_opt:
    #  - seccomp:unconfined
    tty: true
    environment:
      SOLR_HEAP: 1024m
      SOLR_OPTS: -Dsolr.jetty.request.header.size=65535 -Dsolr.modules=scripting
    ports:
      - "8983:8983"
    networks:
      - host-net
      - esmero-net
    volumes:
      - ${ARCHIPELAGO_ROOT}/data_storage/solrcore:/var/solr/data
      - ${ARCHIPELAGO_ROOT}/config_storage/solrconfig:/drupalconfig
      - ${ARCHIPELAGO_ROOT}/data_storage/solrlib:/opt/solr/contrib/archipelago/lib
    entrypoint:
      - docker-entrypoint.sh
      - solr-precreate
      - drupal
      - /drupalconfig
You can also use any of these as reference:
- https://github.com/esmero/archipelago-deployment-live/blob/1.4.0/deploy/ec2-docker/docker-compose-aws-s3-arm64.yml
- https://github.com/esmero/archipelago-deployment-live/blob/1.4.0/deploy/ec2-docker/docker-compose-aws-s3.yml
For this, if you have not already, run:
docker-compose down
Then open your docker-compose.yml file, find the solr: key and replace it with these new settings. If for some reason your volumes do not match our defaults, please adapt them to your custom edits so they match where the Solr core and libraries are saved:
nano docker-compose.yml
Save your changes.
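Before starting anything, it can help to confirm that the edit landed where you expect. The sketch below is an optional check and not part of the tested procedure: solr_image is a hypothetical helper that prints the Solr image tag found in a compose file, ignoring commented-out lines.

```shell
# Hypothetical helper: print the first Solr image tag found in a compose file.
# Commented-out lines are skipped because "image:" must be the first word on the line.
solr_image() {
  sed -n 's/^[[:space:]]*image:[[:space:]]*"\{0,1\}\(solr:[^"[:space:]]*\)"\{0,1\}[[:space:]]*$/\1/p' "$1" | head -n 1
}

# Example, run from inside deploy/ec2-docker:
#   solr_image docker-compose.yml
#   docker-compose config -q   # validates the file and exits non-zero if it cannot be parsed
```

If the printed tag is not the one you just typed, the file you edited is probably not the one docker-compose is reading.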
Step 2: Wipe clean. Get the new configs. Get the new OCR Highlight library
Wait! (breathe.)
This step is only required if you are moving from Solr 8.x to 9.x, or if you have Solr field type definition changes inside 9.x. If you had a stock 1.3.0 with Solr 9.1 and want to move to Solr 9.2, you can skip deleting everything and jump to Step 3!
This step requires some nerve. Be sure you know where you are inside your terminal (always).
Inside your archipelago-deployment-live folder run:
cd data_storage/solrcore
pwd
You should see something like
/home/ec2-user/archipelago-deployment-live/data_storage/solrcore
This means you are in the correct folder. Now it is time to clean your index (really think twice here, OK? You have a backup. Never run any of these without a backup):
sudo rm -rf *
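If you prefer the folder check and the deletion to happen together, the comparison you just did by eye with pwd can be expressed as a guard. This is a sketch under the assumption that your Solr core lives in a folder ending in data_storage/solrcore, as in this guide (in_solrcore_dir is a hypothetical name):

```shell
# Hypothetical guard: succeed only if the given path ends in data_storage/solrcore.
in_solrcore_dir() {
  case "$1" in
    */data_storage/solrcore) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: the wipe only runs when the current folder is the expected one.
#   in_solrcore_dir "$(pwd)" && sudo rm -rf ./*
```

The guard does nothing more than compare the end of the path, so it does not replace the backup. It only lowers the chance of wiping the wrong folder.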
Now we need the new configurations for your Solr (so that the Docker container can re-create the index from scratch). Remember we downloaded a reference/empty Archipelago Deployment Live 1.4.0 at /home/ec2-user/archipelago-deployment-live-1.4.0.
We are going to use the files there to replace your own configs. cd back to your live deployment, assuming here it is (still) /home/ec2-user/archipelago-deployment-live:
cd /home/ec2-user/archipelago-deployment-live
cp -rpv /home/ec2-user/archipelago-deployment-live-1.4.0/config_storage/solrconfig/conf/* /home/ec2-user/archipelago-deployment-live/config_storage/solrconfig/conf/.
Now we need to remove the old OCR library and replace it with the new one:
rm /home/ec2-user/archipelago-deployment-live/data_storage/solrlib/solr-ocrhighlighting-0.7.1.jar
cp -rpv /home/ec2-user/archipelago-deployment-live-1.4.0/data_storage/solrlib/solr-ocrhighlighting-0.8.4-SNAPSHOT.jar /home/ec2-user/archipelago-deployment-live/data_storage/solrlib/.
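To double-check the swap, you can list which OCR highlighting jars are now in the library folder. The sketch below assumes you want exactly one version present at a time; ocr_jars is a hypothetical helper, not something shipped with Archipelago.

```shell
# Hypothetical helper: print the file names of the OCR highlighting jars in a folder.
ocr_jars() {
  for jar in "$1"/solr-ocrhighlighting-*.jar; do
    # When nothing matches, the pattern stays unexpanded; skip it.
    if [ -e "$jar" ]; then
      basename "$jar"
    fi
  done
  return 0
}

# Example on the server:
#   ocr_jars /home/ec2-user/archipelago-deployment-live/data_storage/solrlib
```

If the old 0.7.1 jar still shows up next to the new one, the rm above did not take effect (for example because of file permissions).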
Step 3: docker pull and check
Optional: You might want to review/compare (now) your current Drupal Search API index against our most current one. Specifically:
- https://github.com/esmero/archipelago-deployment-live/blob/1.4.0/drupal/config/sync/search_api.server.esmero_solr.yml
- https://github.com/esmero/archipelago-deployment-live/blob/1.4.0/drupal/config/sync/search_api.index.default_solr_index.yml
and anything else that starts with search_api.
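The comparison can be done file by file with diff. The following is a sketch only: diff_search_api is a hypothetical helper, and the example paths assume that the configuration export from the backup section is visible on the host under drupal/config/backup, which depends on how your volumes are mounted.

```shell
# Hypothetical helper: diff every search_api.*.yml in the reference folder
# against the file of the same name in your own config folder.
# $1 = your config folder, $2 = reference (1.4.0) config folder.
# Returns non-zero if any file differs or is missing on your side.
diff_search_api() {
  status=0
  for ref in "$2"/search_api.*.yml; do
    [ -e "$ref" ] || continue
    name="$(basename "$ref")"
    if [ ! -e "$1/$name" ]; then
      echo "Only in reference: $name"
      status=1
    elif ! diff -u "$1/$name" "$ref"; then
      status=1
    fi
  done
  return "$status"
}

# Example (paths are assumptions, adapt them to your deployment):
#   diff_search_api /home/ec2-user/archipelago-deployment-live/drupal/config/backup \
#     /home/ec2-user/archipelago-deployment-live-1.4.0/drupal/config/sync
```

Differences are expected if you have customized your index. The point is to see them before deciding what to keep.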
Time to fetch the latest Solr that will re-create the index:
docker compose pull
docker compose up -d
Give all a little time to start. Please be patient. To ensure all is well, run (more than once if necessary) the following:
docker ps
You should see something like this if you synced all containers to the latest (your versions and database might vary depending on your server's platform):
CONTAINER ID   IMAGE                                      COMMAND                  CREATED          STATUS           PORTS                              NAMES
5b06ee366f58   jonasal/nginx-certbot                      "/docker-entrypoint.…"   10 minutes ago   Up 10 minutes    0.0.0.0:8001->80/tcp               esmero-web
86b685008158   solr:9.2.1                                 "docker-entrypoint.s…"   10 minutes ago   Up 10 minutes    0.0.0.0:8983->8983/tcp             esmero-solr
a4872b237e17   esmero/cantaloupe-s3:6.0.1-multiarch       "sh -c 'java -Dcanta…"   10 minutes ago   Up 10 minutes    0.0.0.0:8183->8182/tcp             esmero-cantaloupe
bec0b31f3421   mariadb:10.6.18-focal                      "docker-entrypoint.s…"   10 minutes ago   Up 10 minutes    3306/tcp                           esmero-db
85bedadf9732   redis:6.2-alpine                           "docker-entrypoint.s…"   10 minutes ago   10 minutes ago                                      esmero-redis
6a9e9d8647a9   minio/minio:RELEASE.2022-06-11T19-55-32Z   "/usr/bin/docker-ent…"   10 minutes ago   Up 10 minutes    0.0.0.0:9000-9001->9000-9001/tcp   esmero-minio
bc5327680ca7   esmero/php-8.1-fpm:1.2.0-multiarch         "docker-php-entrypoi…"   10 minutes ago   Up 10 minutes    9000/tcp                           esmero-php
d53729be1211   esmero/esmero-nlp:fasttext-multiarch       "/usr/local/bin/entr…"   10 minutes ago   Up 10 minutes    0.0.0.0:6400->6400/tcp             esmero-nlp
Important here is the STATUS column. It should show an uptime (for example, Up 10 minutes) that keeps increasing every time you run docker ps again (and again).
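Reading the STATUS column by eye works, but the same check can be sketched as a filter. Here not_up is a hypothetical name, and the "name|status" input layout is an assumption chosen for this example. The docker ps flags in the usage comment (-a and --format with the .Names and .Status placeholders) are standard.

```shell
# Hypothetical filter: read "name|status" lines on standard input and print
# the names of containers whose status does not start with "Up".
not_up() {
  awk -F '|' '$2 !~ /^Up/ { print $1 }'
}

# Example on the server (-a also lists containers that have stopped):
#   docker ps -a --format '{{.Names}}|{{.Status}}' | not_up
```

An empty result means every container reports an uptime. Any name that is printed belongs to a container that has exited or keeps restarting and deserves a look at its logs.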
Check your Solr logs for failure messages:
docker logs -f -n 100 esmero-solr
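When the log is noisy, a filter can narrow it down to the lines worth reading. This is a rough sketch that assumes failures contain the word ERROR or a Java exception name. It can miss problems that are worded differently (solr_problems is a hypothetical name).

```shell
# Hypothetical filter: keep only log lines that look like failures.
# Assumption: failures mention ERROR or a Java exception class name.
solr_problems() {
  # "|| true" keeps the exit status at zero when nothing matches.
  grep -E 'ERROR|Exception' || true
}

# Example on the server (docker logs can write to stderr, so both streams are merged):
#   docker logs -n 100 esmero-solr 2>&1 | solr_problems
```

No output from the filter is a good sign, but it is not proof that the core loaded. The Solr admin interface on port 8983 remains the more reliable place to confirm that.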
Step 4: Re-index Drupal Search API
Whether or not you synchronize Drupal's Search API server, index and fields is a personal decision (optional). We do recommend it, but your Solr index might already have many customizations, so a better/pro approach would be to diff the .yml files and decide selectively. Once you have decided (or done that, or skipped it), it is time to reindex.
Run the following:
docker exec esmero-php drush search-api-reindex
docker exec esmero-php drush search-api-index
If you made it this far you are done with code/devops (are we ever ready?), and that means you should be able to (hopefully) stay in the Drupal 10.x realm for a few years!
Done!
Need help? Blue Screen? Missed a step? Need a hug or someone that listens to you in silence?
If you see any issues or errors or need help with a step, please let us know (ASAP!). You can either open an issue in this repository or use the Google Group. We are here to help.