UESPWiki:Upgrade History/2011

The UESPWiki – Your source for The Elder Scrolls since 1995
This is an archive of past UESPWiki:Upgrade History discussions. Do not edit the contents of this page, except for maintenance such as updating links.

23 December 2011

  • Installed X-Vary-Options Squid patch on squid2.
  • Redirected main site DNS back to squid2.
  • Update: After a day the hit rate on squid2 is averaging ~90%.

22 December 2011

  • Set up and added squid1 (currently unused) as the third content server. The addition of the tracking JS code reduced the Squid cache hit rate from 85% to 50%, which has put additional load on the content servers and resulted in a few random load spikes.
  • Set up, compiled, and tested the X-Vary-Options patch for Squid on squid1.
  • Stopped Apache on squid1 and removed it from the content balancing.
  • Pointed main site DNS from squid2 to squid1.
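
The effect of the hit-rate drop on the content servers can be estimated with simple arithmetic: requests that miss the Squid cache are passed on to the backends, so the backend share is one minus the hit rate. The sketch below assumes a constant total request rate and treats every miss as equally expensive, which is a simplification; the function and variable names are illustrative only.

```python
def backend_share(hit_rate):
    """Fraction of requests that miss the cache and reach the content servers."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return 1.0 - hit_rate


before = backend_share(0.85)  # hit rate before the tracking JS was added
after = backend_share(0.50)   # hit rate after the tracking JS was added

# Relative increase in requests reaching the content servers
increase = after / before

print(f"Backend share before: {before:.0%}, after: {after:.0%}")
print(f"Backend load multiplier: {increase:.1f}x")
```

Under these assumptions the content servers see roughly 3.3 times as many requests as before, which is consistent with the load spikes noted above.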

20 December 2011

  • Added Google Analytics and Comscore JavaScript to wiki pages for statistic counting.

13 December 2011

  • Switched all applications to use db2 instead of db1 for maintenance on db1.
  • Ran mysqltuner.pl on db2:
-------- General Statistics --------------------------------------------------
[--] Skipped version check for MySQLTuner script
[OK] Currently running supported MySQL version 5.0.77-log
[OK] Operating on 64-bit architecture

-------- Storage Engine Statistics -------------------------------------------
[--] Status: -Archive -BDB -Federated +InnoDB -ISAM -NDBCluster
[--] Data in MyISAM tables: 8G (Tables: 150)
[--] Data in InnoDB tables: 263M (Tables: 87)
[--] Data in MEMORY tables: 1020K (Tables: 4)
[!!] Total fragmented tables: 22

-------- Security Recommendations  -------------------------------------------
[OK] All database users have passwords assigned

-------- Performance Metrics -------------------------------------------------
[--] Up for: 10d 14h 57m 51s (181M q [197.794 qps], 15M conn, TX: 165B, RX: 29B)

[--] Reads / Writes: 95% / 5%
[--] Total buffers: 2.5G global + 2.6M per thread (1000 max threads)
[OK] Maximum possible memory usage: 5.1G (66% of installed RAM)
[OK] Slow queries: 0% (78K/181M)
[OK] Highest usage of available connections: 40% (404/1000)
[OK] Key buffer size / total MyISAM indexes: 1.0G/695.0M
[OK] Key buffer hit rate: 100.0% (23B cached / 1M reads)
[OK] Query cache efficiency: 29.7% (39M cached / 133M selects)
[!!] Query cache prunes per day: 57088
[OK] Sorts requiring temporary tables: 0% (1K temp sorts / 6M sorts)
[!!] Temporary tables created on disk: 32% (1M on disk / 3M total)
[OK] Thread cache hit rate: 99% (404 created / 15M connections)
[OK] Table cache hit rate: 37% (557 open / 1K opened)
[OK] Open file limit used: 1% (592/31K)
[OK] Table locks acquired immediately: 99% (124M immediate / 124M locks)
[OK] InnoDB data size / buffer pool: 263.2M/400.0M

-------- Recommendations -----------------------------------------------------
General recommendations:
    Run OPTIMIZE TABLE to defragment tables for better performance
    Increasing the query_cache size over 128M may reduce performance
    When making adjustments, make tmp_table_size/max_heap_table_size equal
    Reduce your SELECT DISTINCT queries without LIMIT clauses
Variables to adjust:
    query_cache_size (> 1G) [see warning above]
    tmp_table_size (> 100M)
    max_heap_table_size (> 50M)
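
As a sanity check on the report above, the "Maximum possible memory usage" figure can be approximately reproduced from the buffer sizes it lists: the global buffers plus the per-thread buffers multiplied by the maximum number of threads. This is a rough sketch using the rounded values printed in the report, so it only lands close to the reported number; the variable names are illustrative.

```python
# Figures copied from the mysqltuner.pl report (already rounded by the tool).
global_buffers_gb = 2.5     # "Total buffers: 2.5G global"
per_thread_mb = 2.6         # "+ 2.6M per thread"
max_threads = 1000          # "(1000 max threads)"
reported_max_gb = 5.1       # "Maximum possible memory usage: 5.1G"
reported_ram_share = 0.66   # "(66% of installed RAM)"

# Worst case: global buffers plus every permitted thread using its
# full per-thread buffers at the same time.
max_memory_gb = global_buffers_gb + per_thread_mb * max_threads / 1024

# Installed RAM implied by the reported percentage.
implied_ram_gb = reported_max_gb / reported_ram_share

print(f"Estimated maximum memory: {max_memory_gb:.1f}G")
print(f"Implied installed RAM: {implied_ram_gb:.1f}G")
```

The small gap between the estimate and the reported 5.1G comes from the rounding of the inputs. The implied RAM figure suggests the server has about 8G installed, though that is an inference from the percentage and is not stated in the report.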

12 December 2011

  • Uploaded revised fonts for DragonscriptRegular to content1/2/3

7 December 2011

  • Uploaded fonts for DragonscriptRegular to extensions/DragonFont on content1/2/3
  • Switched back to skins.uesp.net for stylePath on content1/2 to address problems caused by non-standard port 81 in skins2.uesp.net:81 (see here).

29 November 2011

  • Received new db2 server within cluster. Standard setup with MySQL.
  • Started replication on db2 from db1.
  • Switched content1/2/3 to use db2 for all reads (db1 for all writes still).
  • Ran the mysqltuner.pl on db1
  • Removed a few localhost users with no passwords set
  • Added skip-bdb
  • Increased user file limit from 8192 to 100000.
  • Increased table_cache from 3000 to 15000.
  • Increased thread_cache from 100 to 1000.
  • Increased innodb_buffer_pool_size from 100 MB to 400 MB.
  • Reduced sort_buffer_size and read_buffer_size.
  • Switched content1/2/3 back to using db1/content3 so maintenance can be done on the db2 databases/mysql.
  • Optimized fragmented tables on db2.
  • Adjusted settings and restarted mysql on db2.
  • Restored content1/2/3 to use just db2 for reads.
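
The read/write split described in this entry (all reads from db2, all writes to db1) can be pictured with the toy router below. This is only an illustration of the idea: MediaWiki performs this routing itself through its database load balancing settings, and the function name, the list of write verbs, and the sample queries here are assumptions made for the sketch.

```python
WRITE_SERVER = "db1"  # master: receives all writes
READ_SERVER = "db2"   # replica: receives all reads

# SQL verbs treated as writes in this simplified model
# (an assumption for the sketch, not an exhaustive list).
WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE", "CREATE", "ALTER", "DROP"}


def choose_server(sql):
    """Return the server a statement would go to under a read/write split."""
    stripped = sql.strip()
    if not stripped:
        raise ValueError("empty SQL statement")
    verb = stripped.split(None, 1)[0].upper()
    return WRITE_SERVER if verb in WRITE_VERBS else READ_SERVER


print(choose_server("SELECT page_title FROM page LIMIT 1"))
print(choose_server("UPDATE page SET page_touched = NOW() WHERE page_id = 1"))
```

Because db2 replicates from db1, reads served by db2 may briefly lag behind the most recent writes; the sketch does not model that delay.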

27 November 2011

  • Small tweak to MetaTemplate on content1/2/3 to fix bug with empty headers not being removed
  • Another MetaTemplate tweak on content1/2/3 to circumvent an SQL query that was triggering out-of-memory errors

23 November 2011

  • Changed wgSessionsInMemcached to true on content1/2/3 to see if that will reduce the very high write load on files1 (1000 writes/sec compared to only 100 reads/sec). Note that this will probably force all logged in users to re-login. This appeared to have no effect in the 10 minutes of monitoring iostat following the change.
  • Added the noatime option to /dev/md2 on files1 and remounted the main partition. Disk writes immediately dropped to 1-100 writes/sec.

19 November 2011

  • Received and set up squid2 (a copy of the squid1 setup).
  • Switched the main domains www.uesp.net and uesp.net to point to squid2. This should hopefully minimize the firewall connection issue as squid2 is outside of the cluster setup.
  • Skin files on content1/2 are being served from skins2.uesp.net:81 which is just the new squid2 server. This is to further reduce the connection load on the firewall.
  • content2/3 have been set up to use the backup database on content3 as the "read-only" database to distribute the load. The effect of this will be observed before content1 is changed as well.
  • Updated permissions for wikiuser on db1/content3 to permit use of the MediaWiki database load balancing settings.
  • content1/2 set up to use content3 for some reads. Load on db1/content3 is being monitored and settings adjusted as needed.

9 November 2011

  • Deleted entries from the wiki's "user_newtalk" table for anonymous IPs whose talk pages were last edited in 2010 or earlier. I also removed entries for a handful of problem IPs where the only recent edits were other editors getting false 'new message' warnings. See UESPWiki talk:Purge Requests#Status Update.
  • Confirmed that smartd e-mail messaging was working on all six servers.
  • Replaced e-mail alias in mdadm.conf with a single e-mail, restarted mdmonitor and tested on db1 and files1.

7 November 2011

  • Tweaked the code for the TitleBlacklist extension to add a new parameter "nonewaccount" (changes made to both TitleBlacklist.list.php and TitleBlacklist.hooks.php)
  • Added admin e-mail addresses to /etc/smartd.conf on all servers and restarted smartd.
  • Added admin e-mail addresses to /etc/mdadm.conf using an e-mail alias mdalert on files1 and db1 and restarted mdmonitor.

6 November 2011

  • Gave "suppressredirect" right to sysops
  • Gave "tboverride" right to bots, patrollers, userpatrollers
  • Added DS and BK namespace abbreviations for Dawnstar and Book, respectively
  • Updated search preferences for existing users to add Skyrim to namespaces searched by default (also added Lore, SI for users who hadn't ever set up any preference for those namespaces).

4 November 2011

  • Increased session.gc_maxlifetime from 1440 to 100000 on content1/2.

1 November 2011

  • Increased MySQL's query_cache_size from 256M to 472M (could not set it much higher or it would reset to 0) on db1.
  • Set mobile.uesp.net to be served from its own MediaWiki directory on content3 and changed the default skin to WPTouch.

31 October 2011

  • Added mobile.uesp.net to the DNS which currently points to content3.uesp.net.
  • Changed query_cache_min_res_unit on db1 from 4096 to 1024 to try and prevent the cache from running out of free blocks (qcache_lowmem_prunes was high, 400k/day).

19 October 2011

  • Turned the MySQL log on in my.cnf on db1.
  • Added skip-name-resolve to my.cnf on db1. This should stop the occasional site outages/delays we've been seeing since the summer.
  • Restarted MySQL on db1.

5 October 2011

  • Changed a few of the e-mail addresses on the wiki and forums:
  • Wiki password e-mails are sent from password@uesp.net
  • Main contact e-mail changed to uespnet@gmail.com
  • Forum e-mails are sent from forums@uesp.net
  • Added the Tes5Mod namespace to all content servers.

21 August 2011

  • Added the skipcaptcha permission to the sysop and blockers groups on all content servers.

16 August 2011

  • Repaired the searchindex table on content3 and restarted database replication.

29 July 2011

  • Set up automatic backups of all servers' configuration files.
  • Checked that backups on db1, content3, and backup1 were operating correctly. Fixed a few missing weekly backup rotations and file syncs on content3/backup1.
  • Upgraded PHP on content3 to 5.2.16 (testing for future MediaWiki upgrade). Procedure is:
    [utterramblings]
    name=Jason's Utter Ramblings Repo
    baseurl=http://www.jasonlitka.com/media/EL$releasever/$basearch/
    enabled=1
    gpgcheck=1
    gpgkey=http://www.jasonlitka.com/media/RPM-GPG-KEY-jlitka
  • Upgrade PHP via yum upgrade php. Note that this will upgrade a variety of related packages, including mysql, so ensure no vital services are affected before continuing.
  • yum install php-eaccelerator
  • yum install php-mcrypt
  • yum erase php-pecl-memcache
  • yum install php-memcache
  • Recompile and install the wikidiff2 extension: make clean, make, make install
  • Restart Apache and check the server's phpinfo.php page to ensure a successful upgrade.

23 July 2011

  • Installed the Oversight extension.
  • Changed the 'move' permission from users to auto-confirmed users.

26 April 2011

  • Search on content3 now uses Sphinx (modified to enable logging of queries).
  • All searches on content1/2 ignore the user's Search Titles Only preference and always include text matches due to the large number of empty search results being returned (more than 1/2 of all searches).

25 April 2011

  • Updated the UespCustomCode extension on content1/2/3 to fix a search related bug.
  • Updated the SearchLog extension to properly handle NULL inputs in the search hooks.
  • Added the "Books:" namespace to content1/2/3.
  • Installed the Sphinx search engine on content3 for testing purposes (see Special:SphinxSearch at http://content3.uesp.net/wiki/Special:SphinxSearch, available only on content3).

24 April 2011

  • Added initial version of the Special:SearchLog extension to content1/2/3.
  • Cleared searchlog tables to remove entries with uppercase characters in them.
  • Updated searchlog extension on content1/2 from the development version on content3.

18 April 2011

  • Added "protect" to the "bot" group permissions on content1/2/3.

26 March 2011

  • Upgraded the UESP forums from phpBB3 3.0.5 to 3.0.8.

9 March 2011

  • Removed the UespSiteStats and UespPayPal extensions from content1/2/3.

5 March 2011

  • Restored syncing of static files on backup1.
  • Squid cache continually restarting. Changed maxfullbufs in squid.conf on squid1 from 4 to 64.

2 March 2011

  • Added Lighttpd stats to Zabbix via a custom client side script.
  • Updated Zabbix server front end on monitor.uesp.net.

1 March 2011

  • Started replication of db1 from backup1.
  • Added Apache monitoring stats in Zabbix via a custom client side script.

28 February 2011

  • Started replication of db1 from content3.

21 February 2011

  • Switched to the new squid1 server.
  • Fixed error in /etc/init.d/zabbix_agentd script on content1/2/3, files1 and restarted the Zabbix agent daemon.
  • Increased max_filedesc in Squid configuration from 1024 to 4096.
  • Set up and started the Zabbix server on content3. Re-enabled monitoring of new servers (except db1).
  • Added monitoring of basic Squid stats through Zabbix.
  • Fixed custom Zabbix monitoring scripts on content3 (MySQL) and files1 (memcached).
  • Modified Squid settings on squid1 to try and let it use more RAM (only 50MB in use):
  • Changed cache_mem on squid1 from 8 to 500 MB.
  • Changed maximum_object_size_in_memory from 8 to 100KB.

19 February 2011

  • Re-enabled Squid on the new content2 Wiki configuration.
  • Re-enabled Wiki memcached settings on new content2 server.
  • Disabled Squid cache logging on squid1 and restarted due to very high IOWait loads (50-70%). IO load dropped to normal levels of 10-20% after half an hour.
  • Increased maximum cache size from 100 to 350GB on squid1 and restarted.
  • Corrected directory permissions on /home/blog on the new content1.

17 February 2011

  • Migrated to the new content2 server.
  • Looking at the error logs in the new content2 server:
  • mod_proxy was not correctly enabled...fixed
  • Improper "Options" variable set in Apache...fixed
  • The UespCustomCode extension has some undefined variable errors occurring in SiteSpecialSearch.php. This appears to be due to a class constructor/initialization function ordering problem (the constructor is running before the initialization function in SiteCustomCode.php). Fixed by adding a check for uninitialized variables and using fixed default values.
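
The kind of fix described in the last item (checking for uninitialized values and falling back to fixed defaults) can be sketched as follows. The actual extension is written in PHP and its code is not shown here, so this is a generic illustration of the pattern; the class name, the setting name, and the default value are all hypothetical.

```python
DEFAULT_NAMESPACES = (0,)  # hypothetical fixed fallback value


class SearchPage:
    """Toy stand-in for a class whose constructor may run before site setup."""

    def __init__(self, site_config=None):
        # site_config is None when the initialization function has not run yet.
        config = site_config if site_config is not None else {}
        # Fall back to a fixed default instead of reading an undefined value.
        self.namespaces = list(config.get("search_namespaces", DEFAULT_NAMESPACES))


early = SearchPage()                                  # constructed before setup
late = SearchPage({"search_namespaces": [0, 100]})    # constructed after setup

print(early.namespaces, late.namespaces)
```

The guard makes the constructor safe regardless of the order in which it and the setup code run, at the cost of silently using defaults when the setup has not happened yet.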

16 February 2011

  • Re-enabled compression on the new files1 server by restoring the default etags configuration settings.

15 February 2011

  • Migrated to the new files1 server.
  • Lighttpd compression on the new files1 disabled due to issues serving compressed files (403 forbidden errors and incorrect mime types).

14 February 2011

  • Remounted the wikiimages directory on content1.

13 February 2011

  • The nfs lock manager on content1 was not running which caused pages to "randomly" take a long time to load (30 seconds to several minutes). I couldn't get rpc.lockd or the nfs services to restart and ended up having to reboot the server which fixed the issue. Undoubtedly, my playing with NFS on the server yesterday caused the issue.

10 February 2011

  • Content1 became unavailable/inaccessible and had to be remotely rebooted. Total downtime was roughly 10 minutes. This should not be related to the recent server upgrade. The Zabbix agent on content1 has been reporting incorrect stats/events for a while (for example, low disk space warnings despite having 200GB free).

7 February 2011

  • Started setting up the new files1 server. Everything went well except the new NFS configuration which times out when trying to mount a share.

4 February 2011

  • Restarted the Zabbix agent on content1. It had been intermittently reporting content1 as "unreachable" in error.
  • Received the 6 new servers from iWeb.

