Discussion: Re: [GENERAL] index row size 2728 exceeds btree maximum, 2713
To me it seems that the designer of this table missed the concept of an
index ... or the concept of a database. One usually looks up data using a
key, but if the whole row is the key, what data is left to look up?

So, long story short: remove "data" from your index. The data column looks
like the data to be looked up using the key (scan_id, host_ip, port_num,
plugin_id, severity), or even less.
Postgres is able to take several indexes over distinct columns into
account, reducing the possible candidates to a handful. So several indexes
are also an option.

|-----Original Message-----
|From: Dinesh Pandey [mailto:dpandey@secf.com]
|Sent: Thursday, 01. January 2004 11:09
|To: 'Richard Huxton'
|Cc: pgsql-general@postgresql.org; 'PostgreSQL'
|Subject: Re: [SQL] [GENERAL] index row size 2728 exceeds btree maximum,
|2713
|
|I am inserting some log messages in the column "data". (Basically I am
|inserting records from reading an XML file.)
|
|In the PRIMARY KEY, btree (scan_id, host_ip, port_num, plugin_id,
|severity, data), data is of type TEXT and can contain long string values.
|
|The question is how to remove this error "index row size 2728 exceeds
|btree maximum, 2713" by increasing the btree size?
|
|The big problem is "I can not add any additional column in this table."
|
|Thanks
|Dinesh Pandey
|
|-----Original Message-----
|From: pgsql-general-owner@postgresql.org
|[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Richard Huxton
|Sent: Thursday, June 02, 2005 3:29 PM
|To: dpandey@secf.com
|Cc: pgsql-general@postgresql.org; 'PostgreSQL'
|Subject: Re: [GENERAL] index row size 2728 exceeds btree maximum, 2713
|
|Dinesh Pandey wrote:
|>    Column   |         Type          |
|> -----------+-----------------------+-----------
|>  scan_id   | bigint                |
|>  host_ip   | character varying(15) |
|>  port_num  | integer               |
|>  plugin_id | integer               |
|>  severity  | character varying(50) |
|>  data      | text                  |
|>
|> Indexes:
|>     "pk_scanned_port_info" PRIMARY KEY, btree (scan_id, host_ip,
|> port_num, plugin_id, severity, data)
|>
|> On inserting a record I am getting this error "index row size 2728
|> exceeds btree maximum, 2713"
|
|Well - the error message is clear enough. The question is, what to do.
|
|Without knowing what the table "means", it's difficult to say what the
|primary key should be, but it seems unlikely to include an
|unlimited-length text field called "data".
|
|If the data itself doesn't offer any suitable candidate keys (as can
|well be the case) then common practice is to generate a unique number
|and use that as an ID - in PostgreSQL's case by use of the SERIAL
|pseudo-type.
|
|Does that help?
|--
|  Richard Huxton
|  Archonet Ltd
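A minimal sketch of this suggestion, assuming the table is named
scanned_port_info (inferred from the index name pk_scanned_port_info) and
that the five remaining columns really are unique together - if they are
not, the ALTER will fail:

    -- Rebuild the primary key without the oversized text column.
    ALTER TABLE scanned_port_info DROP CONSTRAINT pk_scanned_port_info;
    ALTER TABLE scanned_port_info
        ADD CONSTRAINT pk_scanned_port_info
        PRIMARY KEY (scan_id, host_ip, port_num, plugin_id, severity);

    -- Optional separate indexes on columns actually used in lookups;
    -- the planner can consider several of them for one query.
    CREATE INDEX idx_spi_host_ip   ON scanned_port_info (host_ip);
    CREATE INDEX idx_spi_plugin_id ON scanned_port_info (plugin_id);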
On Thu, Jun 02, 2005 at 14:08:54 +0200,
  KÖPFERL Robert <robert.koepferl@sonorys.at> wrote:
> To me it seems that the designer of this table missed the concept of an
> index ... or the concept of a database. One usually looks up data using a
> key, but if the whole row is the key, what data is left to look up?

You sometimes do want to make a whole row a key, to avoid duplicate rows. A
common case is when you use a table to connect two other tables in a
many-to-many relation. It would be rare to want to do that with large text
values though.
KÖPFERL Robert wrote:
> To me it seems that the designer of this table missed the concept of an
> index ... or the concept of a database. One usually looks up data using a
> key, but if the whole row is the key, what data is left to look up?
> So, long story short: remove "data" from your index. The data column
> looks like the data to be looked up using the key (scan_id, host_ip,
> port_num, plugin_id, severity), or even less.
> Postgres is able to take several indexes over distinct columns into
> account, reducing the possible candidates to a handful. So several
> indexes are also an option.

Actually, Dinesh didn't mention he was using this for the speed of lookup.
He'd defined the columns as being the PRIMARY KEY, presumably because he
feels they are/should be unique. Given that they are rows from a logfile,
I'm not convinced this is the case.

--
  Richard Huxton
  Archonet Ltd
On 6/2/05, Richard Huxton <dev@archonet.com> wrote:
> KÖPFERL Robert wrote:
> > To me it seems that the designer of this table missed the concept of an
> > index ... or the concept of a database. One usually looks up data using
> > a key, but if the whole row is the key, what data is left to look up?
> > So, long story short: remove "data" from your index. The data column
> > looks like the data to be looked up using the key (scan_id, host_ip,
> > port_num, plugin_id, severity), or even less.
> > Postgres is able to take several indexes over distinct columns into
> > account, reducing the possible candidates to a handful. So several
> > indexes are also an option.
>
> Actually, Dinesh didn't mention he was using this for the speed of
> lookup. He'd defined the columns as being the PRIMARY KEY, presumably
> because he feels they are/should be unique. Given that they are rows
> from a logfile, I'm not convinced this is the case.
>

If this is a log, he will need a timestamp field for it to be useful;
making that field part of the primary key and leaving "data" out of the
primary key makes more sense to me. A sketch follows.

--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)
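A sketch of what Jaime is proposing - note that Dinesh has said he cannot
add columns to this table, so the new column name (logged_at) and the DDL
are purely illustrative:

    -- Add a timestamp and key on it instead of the unbounded text column.
    ALTER TABLE scanned_port_info
        ADD COLUMN logged_at timestamptz NOT NULL DEFAULT now();

    ALTER TABLE scanned_port_info DROP CONSTRAINT pk_scanned_port_info;
    ALTER TABLE scanned_port_info
        ADD CONSTRAINT pk_scanned_port_info
        PRIMARY KEY (scan_id, host_ip, port_num, plugin_id, severity,
                     logged_at);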
On Thu, Jun 02, 2005 at 13:40:53 +0100,
  Richard Huxton <dev@archonet.com> wrote:
>
> Actually, Dinesh didn't mention he was using this for the speed of
> lookup. He'd defined the columns as being the PRIMARY KEY, presumably
> because he feels they are/should be unique. Given that they are rows
> from a logfile, I'm not convinced this is the case.

Even for that case you could still use hashes. The odds of a false
collision using SHA-1 are so small that some sort of disaster is more
likely. Another possibility, if there is a fixed number of possible
messages, is that they could be entered in their own table with a serial
PK and the other table could reference that PK.
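A sketch of the second idea - normalising the messages into their own table
with a serial key; all names here are invented for illustration:

    CREATE TABLE log_message (
        message_id serial PRIMARY KEY,
        data       text NOT NULL
    );

    CREATE TABLE scanned_port_info (
        scan_id    bigint,
        host_ip    varchar(15),
        port_num   integer,
        plugin_id  integer,
        severity   varchar(50),
        message_id integer NOT NULL REFERENCES log_message (message_id),
        PRIMARY KEY (scan_id, host_ip, port_num, plugin_id, severity,
                     message_id)
    );

The primary key now contains only short, fixed-size columns, so the btree
row-size limit no longer applies.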
Bruno Wolff III wrote:
> On Thu, Jun 02, 2005 at 13:40:53 +0100,
>   Richard Huxton <dev@archonet.com> wrote:
>
>>Actually, Dinesh didn't mention he was using this for the speed of
>>lookup. He'd defined the columns as being the PRIMARY KEY, presumably
>>because he feels they are/should be unique. Given that they are rows
>>from a logfile, I'm not convinced this is the case.
>
> Even for that case you could still use hashes. The odds of a false
> collision using SHA-1 are so small that some sort of disaster is more
> likely. Another possibility, if there is a fixed number of possible
> messages, is that they could be entered in their own table with a serial
> PK and the other table could reference that PK.

Certainly, but if the text in the logfile row is the same, then hashing
isn't going to make a blind bit of difference. That's the root of my
concern, and something only Dinesh knows.

--
  Richard Huxton
  Archonet Ltd
On Thu, Jun 02, 2005 at 18:00:17 +0100,
  Richard Huxton <dev@archonet.com> wrote:
>
> Certainly, but if the text in the logfile row is the same, then hashing
> isn't going to make a blind bit of difference. That's the root of my
> concern, and something only Dinesh knows.

Sure it is. The hash can be used in the primary key instead of the error
message, which should reduce the size of the key enough that he can use a
btree index.
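A sketch of the hash approach using the built-in md5() function (SHA-1
itself would need pgcrypto's digest()); the data_md5 column is an
assumption - Dinesh has said he cannot add columns - and all values below
are made up:

    -- Store a fixed-length hash of the message alongside it.
    ALTER TABLE scanned_port_info ADD COLUMN data_md5 char(32);
    UPDATE scanned_port_info SET data_md5 = md5(data);
    ALTER TABLE scanned_port_info ALTER COLUMN data_md5 SET NOT NULL;

    -- Key on the hash instead of the unbounded text column.
    ALTER TABLE scanned_port_info DROP CONSTRAINT pk_scanned_port_info;
    ALTER TABLE scanned_port_info
        ADD CONSTRAINT pk_scanned_port_info
        PRIMARY KEY (scan_id, host_ip, port_num, plugin_id, severity,
                     data_md5);

    -- Each insert supplies the hash of the message:
    INSERT INTO scanned_port_info
        (scan_id, host_ip, port_num, plugin_id, severity, data, data_md5)
    VALUES (1, '10.0.0.1', 80, 10330, 'Security Note',
            'long log text ...', md5('long log text ...'));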
Bruno Wolff III wrote:
> On Thu, Jun 02, 2005 at 18:00:17 +0100,
>   Richard Huxton <dev@archonet.com> wrote:
>
>>Certainly, but if the text in the logfile row is the same, then hashing
>>isn't going to make a blind bit of difference. That's the root of my
>>concern, and something only Dinesh knows.
>
> Sure it is. The hash can be used in the primary key instead of the error
> message, which should reduce the size of the key enough that he can use
> a btree index.

Sorry - obviously not being clear. Since he's using the index via a primary
key, he'll need the columns that key is over to be unique. If the columns
fail that test in the real world, hashing will replace the index-size error
with an "unable to insert duplicates" error.

--
  Richard Huxton
  Archonet Ltd