Обсуждение: random access - bytea

Поиск
Список
Период
Сортировка

random access - bytea

От
Dennis Bjorklund
Дата:
There have been (in the past at least) plans on adding a random access
interface to bytea fields. I don't find it in the todo, maybe it should be
added?

What kind of syntax have people thought about, to let the client read a
bytea field in a random access way? I'm thinking of something like:

BEGIN;

SELECT id, open_bytea(bytea_field, READ_ACCESS) FROM table;

... and then the client has to access the data in the same transaction,
using the ID returned by open_bytea() ...

COMMIT;

Would that be feasible? I think it looks like a clean solution. To me 
it seems that accessing the data outside a transaction would not be 
something we want anyway, right?

Sematically it's easier to implement read access then write access. For
write access to work with transactions one need for example the blocks in
a bytea field to be stored with the transaction id's an stuff (I don't
know these parts of pg internals well enough yet).

One other thing I don't know well enough yet is the new wire protocol. To 
support something like above, would it need to be changed? I guess so if 
it wasn't designed with this in mind.

-- 
/Dennis



Re: random access - bytea

От
Joe Conway
Дата:
Dennis Bjorklund wrote:
> There have been (in the past at least) plans on adding a random access
> interface to bytea fields. I don't find it in the todo, maybe it should be
> added?

First do this:
ALTER TABLE foo ALTER COLUMN foo_bytea SET STORAGE EXTERNAL;

see:
http://www.postgresql.org/docs/view.php?version=7.3&idoc=0&file=sql-altertable.html

Then do:
UPDATE foo SET foo_bytea = foo_bytea || '';

That will modify the bytea column so that it is stored uncompressed in 
the TOAST table.

Now, simply use substr() to grab any random chunk:
SELECT substr(foo_bytea, 20000, 100) from foo where foo_id = 42;

When TOASTed columns are stored uncompressed, substring will grab just 
the needed chunks from the TOAST table directly. In contrast, when the 
field is stored compressed, it grabs the entire thing, uncompresses it, 
then gets the piece you asked for.

HTH,

Joe



Re: random access - bytea

От
Dennis Bjorklund
Дата:
On Sat, 25 Oct 2003, Joe Conway wrote:

> That will modify the bytea column so that it is stored uncompressed in 
> the TOAST table.
> 
> Now, simply use substr() to grab any random chunk:
> SELECT substr(foo_bytea, 20000, 100) from foo where foo_id = 42;

This would imply that every little read would have to do a scan on a table 
to find the row and then to perform the substr. An open command can 
be optimized a lot more, for example to cache entries that have been 
opened so that it's fast to read the next 1kb or whatever you want.

Also, the above does not solve writes at all which can also be made 
transaction safe and fast with a better api where you can update a part 
of a field.

The above is not really a replacement of the current lo-objects.

-- 
/Dennis



Re: random access - bytea

От
Christopher Kings-Lynne
Дата:
> This would imply that every little read would have to do a scan on a table 
> to find the row and then to perform the substr. An open command can 
> be optimized a lot more, for example to cache entries that have been 
> opened so that it's fast to read the next 1kb or whatever you want.

It's an index scan, so it's pretty fast...

Chris




Re: random access - bytea

От
Dennis Bjorklund
Дата:
On Sun, 26 Oct 2003, Christopher Kings-Lynne wrote:

> > to find the row and then to perform the substr. An open command can 
> > be optimized a lot more, for example to cache entries that have been 
> > opened so that it's fast to read the next 1kb or whatever you want.
> 
> It's an index scan, so it's pretty fast...

And if you read a lot it will be cached, but it's still a little extra
overhead and the write case it does not solve at all.

-- 
/Dennis



Re: random access - bytea

От
Hannu Krosing
Дата:
Dennis Bjorklund kirjutas P, 26.10.2003 kell 07:30:
> On Sat, 25 Oct 2003, Joe Conway wrote:
> 
> > That will modify the bytea column so that it is stored uncompressed in 
> > the TOAST table.
> > 
> > Now, simply use substr() to grab any random chunk:
> > SELECT substr(foo_bytea, 20000, 100) from foo where foo_id = 42;
> 
> This would imply that every little read would have to do a scan on a table 
> to find the row and then to perform the substr. An open command can 
> be optimized a lot more, for example to cache entries that have been 
> opened so that it's fast to read the next 1kb or whatever you want.
> 
> Also, the above does not solve writes at all which can also be made 
> transaction safe and fast with a better api where you can update a part 
> of a field.

I brought it up once and Tom answered that TOAST tables are not
transaction aware, so you can't update just some parts of toasted
entities - you must always write the whole fields. So this will not be
just an api change.

> The above is not really a replacement of the current lo-objects.

True.

-------------
Hannu



Re: random access - bytea

От
Dennis Bjorklund
Дата:
On Sun, 26 Oct 2003, Hannu Krosing wrote:

> I brought it up once and Tom answered that TOAST tables are not
> transaction aware, so you can't update just some parts of toasted
> entities - you must always write the whole fields. So this will not be
> just an api change.

Yes, the blocks (or what one selects to work with) in a toasted field have
to have transaction stuff stored, just like tuples. That is clear if
different transactions shall have different views of the "file".

-- 
/Dennis