Discussion:
How to create auto increment key for a table in hive?
Jone Zhang
2017-04-11 09:02:19 UTC
Many people write to this Hive table.
How can I create an auto-increment key for a table in Hive?

For example
create table test(id, value)
load data v1 v2 into table test
load data v3 v4 into table test

select * from test
1 v1
2 v2
3 v3
4 v4
...


Thanks
Luis
2017-04-12 10:04:25 UTC
Hi Jone,
I'd like to point out that Hive supports ACID (still at a very early stage),
but it is a feature that most people don't use in real production systems.
I don't think there is anything like the "auto-increment keys" of an RDBMS,
because of the parallel nature of Hive and the Hadoop ecosystem. Still, for
your use case you can try writing your own UDF.
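
A workaround sometimes used instead of a true auto-increment column is to number the new rows with row_number() and offset them by the current maximum id. The sketch below assumes two hypothetical tables, staging_seq (new rows) and test_seq (target), which are not from this thread. It is only safe with a single writer at a time: two concurrent loads could read the same maximum and produce duplicate ids.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE IF NOT EXISTS staging_seq (value STRING);
CREATE TABLE IF NOT EXISTS test_seq (id BIGINT, value STRING);

-- Append the staged rows, continuing the numbering after the current max id.
-- COALESCE handles the first load, when test_seq is still empty.
INSERT INTO TABLE test_seq
SELECT m.max_id + row_number() OVER (ORDER BY s.value) AS id,
       s.value
FROM staging_seq s
CROSS JOIN (SELECT COALESCE(MAX(id), 0) AS max_id FROM test_seq) m;
```

The cross join is against a one-row subquery, so it does not multiply rows, but clusters running in strict mode may reject it unless cartesian products are allowed.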

Apart from that, you can generate a UUID with the reflect UDF:
reflect("java.util.UUID", "randomUUID")
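
As a rough sketch of how that call could be used when loading data, the query below assigns a UUID to each row while copying it from a staging table. The table names staging_uuid and test_uuid are hypothetical, and the id column is a STRING rather than an integer, so the keys are unique but not sequential.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE IF NOT EXISTS staging_uuid (value STRING);
CREATE TABLE IF NOT EXISTS test_uuid (id STRING, value STRING);

-- reflect() calls the static Java method java.util.UUID.randomUUID()
-- once per row and returns its string form.
INSERT INTO TABLE test_uuid
SELECT reflect("java.util.UUID", "randomUUID") AS id,
       value
FROM staging_uuid;
```

Because each UUID is generated independently, this works with several writers loading in parallel, at the cost of losing the 1, 2, 3, ... ordering from the original example.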

Thanks,
Luis
--
Regards,
Luís Marques
Gopal Vijayaraghavan
2017-04-12 10:41:47 UTC
I'd like to point out that Hive supports ACID (still at a very early stage), but it is a feature that most people don't use in real production systems.
Yes, you need ACID to maintain multiple writers correctly.

ACID does have a global primary key (which is not a single integer) - ROW__ID.

select ROW__ID, * from acid_table;

will return a unique value for each row.
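
For context, here is a minimal sketch of what such a table might look like. The column names, bucket count, and session settings are assumptions about a typical Hive 2.x ACID setup, not details from this thread; check them against your own cluster configuration.

```sql
-- Assumed session settings for ACID (often set cluster-wide instead).
SET hive.support.concurrency=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lock.DbTxnManager;

-- Hypothetical transactional table: ORC storage, with bucketing as
-- required for ACID tables in Hive releases before 3.0.
CREATE TABLE IF NOT EXISTS acid_table (value STRING)
CLUSTERED BY (value) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

INSERT INTO TABLE acid_table VALUES ('v1'), ('v2');

-- ROW__ID is a struct, so its fields can be selected individually.
SELECT ROW__ID.bucketid, ROW__ID.rowid, value
FROM acid_table;
```

The individual fields are only unique in combination (together with the transaction or write id), which is why ROW__ID as a whole, rather than any single integer inside it, serves as the key.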

Cheers,
Gopal
