Old 11-06-2008, 07:44 AM
"Mindaugas Riauba"
 
Default Painfully slow NetApp with databas

Hello,

We have a long-running problem with our NetApp filers. When we connect
a server to the filer, sequential read performance is ~70MB/s. But once
we run a database on the server, sequential read performance drops to
~11MB/s. That is happening with two servers: one runs Oracle, the other
MySQL. During the speed tests the database load is very light (less
than 1MB/s of reads and writes), and the NetApp was serving only these
two servers.

What could be causing such a slowdown, and the overall slow read
performance? Writes are fast, >100MB/s.
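
For reference, a sequential read test of this kind boils down to
something like the Python sketch below. The device path is only an
example, and the page cache has to be dropped (or far more data than
the server has RAM must be read) for the result to reflect the storage
rather than memory.

    # Rough sequential-read throughput test, roughly what "dd bs=1M" does.
    # DEVICE is an example path; point it at the LUN or a large file.
    import os, time

    DEVICE = "/dev/sdb"          # example only
    BLOCK = 1024 * 1024          # 1 MB reads
    TOTAL = 1024 * 1024 * 1024   # stop after 1 GB

    fd = os.open(DEVICE, os.O_RDONLY)
    done = 0
    start = time.time()
    while done < TOTAL:
        buf = os.read(fd, BLOCK)
        if not buf:              # end of file/device
            break
        done += len(buf)
    elapsed = time.time() - start
    os.close(fd)
    print("%.1f MB/s" % (done / elapsed / (1024.0 * 1024.0)))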

NetApp support claims that such performance is normal. Somehow
I do not believe that a 2007 model should deliver such 20th-century
performance levels.

The servers run fully updated 64-bit RHEL4 and are connected to a
NetApp FAS3040 over a SAN using QLogic 2Gb/s FC adapters. I tried both
the Red Hat and the QLogic (for NetApp) drivers. Changing FC parameters
(such as queue size) did not help either.

Regards,

Mindaugas
 
Old 11-06-2008, 11:46 AM
"nate"
 
Default Painfully slow NetApp with databas

Mindaugas Riauba wrote:

> NetApp support claims that such performance is normal. Somehow
> I do not believe that a 2007 model should deliver such 20th-century
> performance levels.

How many disks, what RPM are they running at, what I/O block size
is being used and what protocol (NFS/iSCSI/FC) is being used?

Checking an Oracle DB I used to run, it averages 7kB I/Os with
spikes to 59kB. For a 10k RPM disk, a 7kB I/O size means roughly
800 kBytes/second per spindle before latency starts to become an
issue, depending on the controller type. Really high-end controllers
can go up to about 1,312 kB/second instead of 800. The array reports
Oracle using an average of 985 kBytes/second, with spikes to
28 MBytes/second.

A MySQL DB I used to run averages 41kB I/Os with spikes up to
333kB. For a 10k RPM disk, a 41kB I/O size works out to roughly
4,500 kBytes/second per spindle. The array reports MySQL using an
average of 3,200 kBytes/second, with spikes to 34.1 MBytes/second.
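
The arithmetic behind those per-spindle figures is just IOPS times I/O
size; the ~115 random IOPS used below is a typical assumed figure for a
10k RPM drive (average seek plus half a rotation), not a number read
off the array.

    # Back-of-the-envelope: per-spindle throughput = IOPS * I/O size.
    # 115 random IOPS per 10k RPM disk is an assumed figure.
    iops = 115
    for io_kb in (7, 41):        # the average I/O sizes mentioned above
        print("%2d kB I/Os x %d IOPS ~= %4d kB/s per spindle"
              % (io_kb, iops, io_kb * iops))
    # 7 kB -> ~800 kB/s, 41 kB -> ~4,700 kB/s. Spreading the data across
    # N spindles scales this roughly linearly until the controller or
    # the FC link becomes the bottleneck.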

The array throughput numbers include the benefit of the disk
cache, while the raw spindle numbers assume no cache
(i.e. "worst case" performance).

Both of those are connected via Fibre Channel, so performance will
be quite a bit higher than it would be over NFS or iSCSI.

So the numbers you're seeing could be perfectly reasonable, as
NetApp suggests, depending on the exact workload and your array
configuration. For the workload you should look to the array for
statistics. I'm not too familiar with NetApp arrays, but I assume
they offer a wide range of statistics; hopefully I/O size is among
them, as that is the most critical factor in determining throughput.
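
If the array does not expose I/O size, it can also be estimated on the
host from /proc/diskstats. A rough sketch, assuming a 2.6-kernel field
layout; the device name is only an example:

    # Estimate average read size and read rate for a block device.
    # 2.6 /proc/diskstats fields: major minor name reads reads_merged
    # sectors_read ms_reading writes ...
    import time

    DEV = "sdb"   # example device name

    def read_stats(dev):
        for line in open("/proc/diskstats"):
            f = line.split()
            if f[2] == dev:
                return int(f[3]), int(f[5])   # reads completed, sectors read
        raise ValueError("device %s not found" % dev)

    r1, s1 = read_stats(DEV)
    time.sleep(10)                     # sample while the test is running
    r2, s2 = read_stats(DEV)

    reads = r2 - r1
    kbytes = (s2 - s1) * 512 / 1024.0  # a sector is 512 bytes
    if reads:
        print("avg read size %.1f kB, %.1f reads/s, %.1f kB/s"
              % (kbytes / reads, reads / 10.0, kbytes / 10.0))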

The array running the above databases has 40 10k RPM disks with
the data evenly distributed across all spindles for maximum
performance. It is also host to about 25 other systems.

NetApps certainly aren't the fastest thing in the west but
given your performance levels it sounds like you don't have
many disks connected and are limited by the disks rather
than the controller(s).

Most low end arrays don't offer the level of visibility that
the enterprise ones do.

On that note, I'm getting a new 150TB array in today, a 3PAR T400
virtualized storage system. Pretty excited about that.

nate


 
Old 11-06-2008, 08:32 PM
"David Ross"
 
Default Painfully slow NetApp with databas

> -----Original Message-----
> From: centos-bounces@centos.org [mailto:centos-bounces@centos.org]On
> Behalf Of Mindaugas Riauba
> Sent: Thursday, November 06, 2008 3:45 AM
> To: nahant-list@redhat.com; CentOS@centos.org
> Subject: [CentOS] Painfully slow NetApp with databas
>
> We have a long-running problem with our NetApp filers. When we connect
> a server to the filer, sequential read performance is ~70MB/s. But once
> we run a database on the server, sequential read performance drops to
> ~11MB/s.

There are many factors which impact database I/O performance. You are comparing two probably dissimilar I/O paths. Configuration differences (disk, controller, connection, kernel parameters) may be part of the problem.

Physical vs. logical sequence, i.e. disk head movement, is particularly significant for sequential access. Throughput increases of 200% to 500% are not uncommon after packing (dump/restore) a large table.
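
The head-movement effect is easy to demonstrate on any large file: read the same amount of data once sequentially and once at random offsets and compare. A small sketch; the path is an example, and the file should be much larger than RAM (or the cache dropped between runs) for the difference to show:

    # Compare sequential vs. random reads of the same file to show the
    # cost of disk head movement. PATH is an example; use a file much
    # bigger than RAM so the page cache does not hide the difference.
    import os, random, time

    PATH = "/data/bigfile"        # example only
    BLOCK = 8192                  # 8 kB, a typical database block size
    COUNT = 20000                 # reads per pass

    def timed_pass(sequential):
        fd = os.open(PATH, os.O_RDONLY)
        size = os.fstat(fd).st_size
        start = time.time()
        for i in range(COUNT):
            if sequential:
                off = i * BLOCK
            else:
                off = random.randrange(0, size - BLOCK)
            os.lseek(fd, off, 0)  # 0 == SEEK_SET
            os.read(fd, BLOCK)
        os.close(fd)
        return COUNT * BLOCK / (time.time() - start) / (1024.0 * 1024.0)

    print("sequential: %.1f MB/s" % timed_pass(True))
    print("random:     %.1f MB/s" % timed_pass(False))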

The database translates and, in theory, optimizes SQL statements, generating query plans. These should be examined to determine whether they are doing what is expected.
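
Each database has its own tool for this (EXPLAIN PLAN in Oracle, EXPLAIN in MySQL). Purely as an illustration of the idea, here is a minimal sketch using Python's built-in sqlite3, showing the plan change from a full table scan to an index search once an index exists:

    # Minimal query-plan illustration; sqlite3 is used only because it
    # ships with Python. Oracle and MySQL have their own EXPLAIN tools.
    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (id INTEGER, customer INTEGER, total REAL)")

    query = "SELECT total FROM orders WHERE customer = 42"

    # Without an index the plan is a full table scan.
    for row in db.execute("EXPLAIN QUERY PLAN " + query):
        print(row)

    # With an index on the filtered column it becomes an index search.
    db.execute("CREATE INDEX idx_customer ON orders (customer)")
    for row in db.execute("EXPLAIN QUERY PLAN " + query):
        print(row)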

> That is happening with two servers: one runs Oracle, the other
> MySQL. During the speed tests the database load is very light (less
> than 1MB/s of reads and writes), and the NetApp was serving only these
> two servers.
>
> What could be causing such a slowdown, and the overall slow read
> performance? Writes are fast, >100MB/s.
>

Database performance data is another good source of diagnostic info. It can help pinpoint query plan problems.

This article may help you put all the pieces together. It is one of the best I have found:

http://www.miracleas.com/BAARF/oow2000_same.pdf

The author is Juan Loaiza from Oracle.


Hope some of this helps.

david




 
Old 11-06-2008, 10:15 PM
"William L. Maltby"
 
Default Painfully slow NetApp with databas

On Thu, 2008-11-06 at 16:32 -0500, David Ross wrote:
> <snip>

> There are many factors which impact database I/O performance. You are comparing two probably dissimilar I/O paths. Configuration differences (disk, controller, connection, kernel parameters) may be part of the problem.
>
> Physical vs. logical sequence, i.e. disk head movement, is particularly significant for sequential access. Throughput increases of 200% to 500% are not uncommon after packing (dump/restore) a large table.
>
> The database translates and, in theory, optimizes SQL statements, generating query plans. These should be examined to determine whether they are doing what is expected.
>
> > That is happening with two servers: one runs Oracle, the other
> > MySQL. During the speed tests the database load is very light (less
> > than 1MB/s of reads and writes), and the NetApp was serving only these
> > two servers.
> >
> > What could be causing such a slowdown, and the overall slow read
> > performance? Writes are fast, >100MB/s.
> >
>
> Database performance data is another good source of diagnostic info. It can help pinpoint query plan problems.
>
> This article may help you put all the pieces together. It is one of the best I have found:
>
> http://www.miracleas.com/BAARF/oow2000_same.pdf
>
> The author is Juan Loaiza from Oracle.
>
>
> Hope some of this helps.

I don't know if what I'm about to add is too dated. But here goes. I'm
sure much still applies because the basic issues and concepts in play
remain unchanged.

What I'm unsure of is whether any of this is related to the problem
being seen.

Back in the early days of SQL databases, a lot of folks had no
understanding of the logical/physical interactions. I helped many folks
improve performance (during my contracting/consulting days) just by
examination of these two components in conjunction with access patterns.

The critical components were the most frequent access/update patterns
versus the physical organization of the primary table(s) and their
_primary_ indexes, or hashes. Do DBMs still use those things?
Regardless, even with hash tables, these relationships still are
important. Also, selection of proper hashing algorithms was important.

Further, keeping the physical sequence of the underlying DB files
optimal for the "critical" access profiles was often overlooked. As
updates are done, especially adds/deletes of entries, the physical
organization quickly deteriorates with regard to the "critical"
access profile(s). It used to be that "holes" would be left in the
primary data space, resulting in constantly growing files. IIRC, most
major DBMs have addressed this by tracking and re-using these areas.
However, that tends to further "corrode" the physical sequence away
from the one desired. Over time, it may become very noticeable.

On a very active, heavily loaded DB, this can occur quite quickly. The
immediate effects could be ameliorated a bit by having sufficient
"overflow" areas - "empty spaces" at the end that received new records
when using indexes or hash tables. During reorganization the overflow
contents would be merged into the appropriate areas and the "overflow"
was again pristine.

Periodic "reorganization", a.k.a "reload" can be used to keep
performance within acceptable limits.

If the problem system has not been properly organized for the activity
profiles encountered, or if it has had lots of updates without being
"reorganized", maybe that is the cause of the problem being seen.

In later implementations, these ill effects are somewhat reduced by
transaction queues that attempt to optimize the sequence of
activities. However, if the DBM is not running on a "raw" partition,
where it knows the physical location of items, this may not be as
effective as it could be.

Further, with the advent of these added "in memory" operations, there
is a risk of "thrashing" the VM. With heavy use of buffering, caching,
etc., even the fantastic speed of today's equipment may suffer enough
slowdown to be noticeable, apart from the physical disk activity. This
is exacerbated by the physical organization issues mentioned above.
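
Whether the VM really is thrashing is easy to check on a 2.6 kernel by
watching the swap counters, the same data the si/so columns of vmstat
show. A quick sketch:

    # Quick check for VM thrashing: sample the swap counters twice and
    # report pages swapped in/out per second.
    import time

    def swap_counters():
        counters = {}
        for line in open("/proc/vmstat"):
            name, value = line.split()
            counters[name] = int(value)
        return counters["pswpin"], counters["pswpout"]

    in1, out1 = swap_counters()
    time.sleep(10)
    in2, out2 = swap_counters()

    print("swap-in:  %.1f pages/s" % ((in2 - in1) / 10.0))
    print("swap-out: %.1f pages/s" % ((out2 - out1) / 10.0))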

I can't say if any of this is your problem or not. But barring discovery
of any system/physical problems, I suspect this might be useful.

On a last note, I'll give an example of potential gains. Keep in mind
this was long ago and far away (in my mind) on much slower equipment.
The application was written in C (a fair amount by myself) and so I know
the code was fast. Speed has always been my thing. A fellow with
relatively little experience was in charge of the database(s).

My inquiry application, fairly complex but one I knew to be
code-efficient, was taking over two minutes (about 2:30 IIRC) to fill
a screen (a text screen at that). The other, _much_ simpler
applications ran much faster. But those applications were "peripheral"
to the main task, for which I was responsible.

I entered discussion with the database person, educated him a bit on
what I thought was happening, and suggested some changes based on the
items I discussed above. He made the changes and we ran some tests.

All the "peripheral" applications ran just as faster of faster.

Over the next several days of hitting it every way we could think of, as
hard as we could, the _maximum_ wait was 9 seconds. The median came in
at 6-7 seconds.

I don't know what capabilities are provided to administrators these
days, but if tools that give sufficient information and control are
available, maybe there is a (partial?) solution hidden in there.

>
> david
> <snip sig stuff>

HTH
--
Bill

 
Old 11-06-2008, 11:33 PM
"John"
 
Default Painfully slow NetApp with databas

>>The critical components were the most frequent access/update patterns
>>versus the physical organization of the primary table(s) and their
>>_primary_ indexes, or hashes. Do DBMs still use those things?

Primary indexes have a lot more to do with I/O than most people think. You
really have to understand how the database physically stores the data in
rows and columns. The biggest index killer is clustered indexes. Among
other things, those types of indexes take longer to execute. Why? Because
the database reorders all of the data from index 1 to index infinity.
Regular (standard) indexes do not do so. Heap indexes, so to speak, just
slap the data in and don't care what order it is in. Foreign key relations
can add to this as well.

This can be verified by digging deep into your DB documentation. You can
then actually execute queries to pull the raw data pages from the database.

Bill, you're the first one I have heard say this in some time.

Among other things, is the DB running in memory? Is the DB making temp
tables in memory and then dropping them?
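
For the MySQL box, those two questions can be answered from the
server's status counters. A sketch, assuming the MySQLdb driver is
installed and the credentials are adjusted; Created_tmp_disk_tables
climbing quickly means temporary tables are spilling from memory to
disk:

    # Check whether MySQL is creating temporary tables and whether they
    # spill to disk. Assumes the MySQLdb driver; credentials are examples.
    import MySQLdb

    conn = MySQLdb.connect(host="localhost", user="root", passwd="secret")
    cur = conn.cursor()
    cur.execute("SHOW STATUS LIKE 'Created_tmp%'")
    for name, value in cur.fetchall():
        # Created_tmp_tables vs. Created_tmp_disk_tables shows how many
        # temp tables stayed in memory and how many hit the disk.
        print("%s = %s" % (name, value))
    conn.close()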

 
