
Importing CSV Data to MySQL (Microsoft Access, Excel, etc.)

karma

I have the unenviable task of migrating some Microsoft Office Access tables into MySQL. The first step, assuming you don't want to go through the pain of setting up a MySQL ODBC connection, is to export these tables to Excel, then use Excel to export the spreadsheets to CSV format. One might think one would be able to export to CSV directly from Access, but one might think a lot of things about Microsoft products.

You'll need to create your table structure in MySQL based on the header row of the CSV and the type of data in the columns. Don't create an auto-incrementing ID index or any of the extra columns you want to add yet; that will have to wait until after the import, since MySQL maps the CSV columns to the table columns strictly in order. Be sure to structure your table accordingly, as in the sketch below.
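
For example, if the CSV's header row were the hypothetical name,price,quantity, the table might look like this (column names and types are illustrative; note there is no id column yet):

mysql> create table `mytable` (
    ->   `name` varchar(255) not null,
    ->   `price` float not null,
    ->   `quantity` int unsigned not null
    -> );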

Once your table has been prepared, edit the CSV: delete the column header row and any blank or invalid lines, and check the cells for formatting. If you are trying to import prices into a FLOAT column and they include currency symbols, spaces or delimiters, you are going to have problems unless you take this into account first. Excel wraps cells containing commas (or whatever cell delimiter you specified) in double quotes; MySQL will read those commas as delimiters anyway, so if you need them included in your dataset you might need to choose a more appropriate delimiter instead (the pipe |, for example). The double quotes will also negatively impact cells of a type which does not support them (INT, FLOAT) and be included verbatim in those that do (VARCHAR, TEXT), so it is advisable to run a search-and-delete prior to importing.
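
A few quick sed passes can take care of the most common offenders - these patterns are illustrative, so adjust them to your own data before trusting them:

sed -i '1d' myfile.csv        # delete the column header row
sed -i 's/"//g' myfile.csv    # strip all double quotes
sed -i 's/\$//g' myfile.csv   # strip currency symbols (adjust for £, € etc.)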

Once your CSV passes inspection, upload it to your SQL server, log in to the MySQL CLI, select the appropriate database and run:

mysql> load data local infile 'myfile.csv' into table `mytable` fields terminated by ',' lines terminated by '\n';
Query OK, 2189 rows affected, 85 warnings (0.01 sec)
Records: 2189  Deleted: 0  Skipped: 0  Warnings: 84

Replace ',' with an appropriate delimiter if you are not using a true CSV, e.g. '\t' for TSV (tab separated values).
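
Alternatively, if you'd rather leave Excel's double quotes and the header row in place, standard LOAD DATA clauses can deal with both for you:

mysql> load data local infile 'myfile.csv' into table `mytable` fields terminated by ',' optionally enclosed by '"' lines terminated by '\n' ignore 1 lines;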

If this has produced warnings you're going to have to TRUNCATE TABLE, go back to your CSV and inspect the formatting again. Run

mysql> show warnings;
+---------+------+------------------------------------------------------------------------------+
| Level   | Code | Message                                                                      |
+---------+------+------------------------------------------------------------------------------+
| Warning | 1262 | Row 487 was truncated; it contained more data than there were input columns  |
| Warning | 1265 | Data truncated for column 'thing1' at row 531                                |
| Warning | 1265 | Data truncated for column 'thing2' at row 531                                |

...

to see what went wrong and identify the line number(s) requiring modification.
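
Once the import comes back clean you can add the auto-incrementing ID mentioned earlier, e.g.:

mysql> alter table `mytable` add column `id` int unsigned not null auto_increment primary key first;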

XCache: My PHP Opcode Cache and Datastore in Shining Armour

karma

I recently mentioned that using APC with threaded Apache and PHP was incredibly unstable. For a few days I moved back to the prefork (process-based) Apache MPM to see if it was worth bringing back APC. This eventually brought the load average on the SQL VM to about 40 with what I would normally consider an appropriate number of MaxClients, with the odd spike to about 10 when restricted to 128. Page load time was also noticeably increased, so I decided to give XCache a whirl (thanks to LiteStar for mentioning it in the comments of that article).

XCache is a stable and (so far, it seems) genuinely thread-safe opcode cache and datastore. It is developed by mOo, a lighttpd developer, and hosted under the lighttpd domain. Despite this, it is built as a PHP extension in the same manner as APC, meaning it can be used with mod_php, fastcgi etc. Like APC, XCache features shared memory variable storage functions with a simple name/value scheme in addition to the opcode cache. I was able to quickly write an SQL wrapper using these that reduced the database hits in one of my apps by a factor of about 10,000 - making virtually the whole thing run out of RAM.
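
The idea, as a minimal sketch - cached_query() and run_query() are hypothetical names, but xcache_isset(), xcache_get() and xcache_set() are XCache's actual variable store API:

<?php
// Minimal sketch of a cached query wrapper on XCache's shared-memory
// store. run_query() is a stand-in for your real database call.
function cached_query($sql, $ttl = 300)
{
    $key = 'sql_' . md5($sql);
    if (xcache_isset($key))
        return xcache_get($key);     // serve the result from shared memory
    $result = run_query($sql);       // hit the database (hypothetical helper)
    xcache_set($key, $result, $ttl); // cache it for $ttl seconds
    return $result;
}
?>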

Also like APC, XCache comes with bundled admin scripts which - while perhaps not as pretty as APC's - I certainly find more insightful. Some folks find it faster; I find it works, so it looks like I'll be sold on it for some time.

Cheers, Litey. And cheers, mOo.

UPDATE Or not... After a few weeks of faithful service XCache finally segfaulted on me:

apache2[17395]: segfault at 332e352e ip b5ba90f7 sp a4b4cb70 error 4 in xcache.so[b5b96000+1d000]
apache2[16754]: segfault at 4320526d ip b5ba4297 sp aab5ad60 error 4 in xcache.so[b5b96000+1d000]
apache2[27959]: segfault at 43205445 ip b5ba90f7 sp aab58b70 error 4 in xcache.so[b5b96000+1d000]
apache2[15215]: segfault at 2 ip b5ba90f7 sp a534db70 error 4 in xcache.so[b5b96000+1d000]
apache2[15536]: segfault at 362e3335 ip b5ba90f7 sp ac35bb70 error 4 in xcache.so[b5b96000+1d000]

I'm doing my best at this moment to reproduce the situation.

UPDATE It turns out this might actually be a bug in PHP. mOo was kind enough to respond in detail:

A PHP 5.3 bug No.62432 ReflectionMethod random corrupt memory on high
concurrent was fixed that cause PHP itself unstable, bogus reported as
XCache bug. It is false positive to reproduce with XCache
loaded/enabled only, just because XCache make PHP run faster than
with/without other opcode cacher. PHP up to 5.3.14 is affected. It is
highly suggested to upgrade to newer PHP
https://bugs.php.net/bug.php?id=62432

so, upgrade to 5.4 if applicable. upgarde to
http://xcache.lighttpd.net/wiki/Release-2.0.1 if you still have
problem with PHP 5.4 + XCache 1.3.2

I had actually realized this myself while he was writing his reply, when I read the following at https://groups.google.com/forum/?fromgroups#!topic/xcache/pZHjUu3Dq3k:

PHP 5.3.14 is unstable, Please upgrade to new version. You have been warned I've tried hard to make it stable even in 2.0 big jump, yet anything that can go wrong goes wrong.

Just goes to show I need to do more reading before jumping the gun :p

PHP Scripts with Calls to ffmpeg Functions Terminate Early

karma

If your PHP scripts die suddenly on calls to ffmpeg functions, you may have:
- Built a new version of PHP without recompiling ffmpeg-php against it
- Switched to or from thread safe PHP

The solution in both cases is to rebuild ffmpeg-php against the current PHP; on Gentoo:

# emerge --update --newuse ffmpeg ffmpeg-php
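
After rebuilding, restart Apache and confirm the extension loads (php -m lists compiled modules):

# /etc/init.d/apache2 restart
# php -m | grep -i ffmpeg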

mpm_worker and Thread-Safe PHP Are Toxic to APC

karma

Technically the reverse is true, but with all the problems folks seem to be having with the threaded Apache MPM (mpm_worker) and the supposedly "thread safe" versions of PHP, one wonders. It is said that PHP itself is thread safe but that many of its add-ons (apparently APC among them) are not. I can testify to that: I've been running a high volume site with threads for a few months with no issues except the big one - APC works for about 5 minutes, then Apache starts churning out blank pages. PHP appears to segfault, but who knows what's really going on when absolutely nothing shows up in any logs.

Supposedly mpm_worker offers great memory savings, but I've failed to see them, and I've also failed to see any real difference in performance - though you could chalk that up to my having lost the bytecode cache. I've been back on mpm_prefork for a few hours now and haven't noticed any significant change in load, though I'll know by tomorrow whether the (apparently now stable) APC is doing anything for me.

So far it's looking like threaded Apache at this stage only really has a place on RAM-constrained VPSes; as far as stability and interoperability are concerned, if you've got the RAM for prefork, use it, baby.

UPDATE Looks like load is actually about half what it used to be at peak hours. Threading is definitely not worth losing APC, at least for me.

/bin/rm: Argument list too long

karma

You may encounter this error when deleting all the files in a given directory:

/bin/rm: Argument list too long

This happens because the kernel limits the combined size of command-line arguments and environment data passed to a single program (ARG_MAX, traditionally 128K on Linux); rm * makes the shell expand the glob into every filename in the directory, overflowing that buffer.
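
You can check your system's actual limit with getconf (traditionally this prints 131072, i.e. 128K, though newer kernels tie it to the stack size limit):

getconf ARG_MAX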

The solution is to have find pass the directory's files to rm in batches via xargs:

find ./path/ -maxdepth 1 -type f -print0 | xargs -0 rm
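
If your find supports it, -delete does the same job without spawning rm at all:

find ./path/ -maxdepth 1 -type f -delete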