Discussion:
Hive equivalent of group_concat() ?
Saurabh S
2012-05-11 20:02:56 UTC
Permalink
As far as I understand, there is no equivalent of MySQL group_concat() in Hive. This stackoverflow question is from Sept 2010: http://stackoverflow.com/questions/3703740/combine-multiple-rows-into-one-space-separated-string

Does anyone know any other method to create a delimited list from from table rows? There are a few methods listed in Anthony Molinaro's SQL Cookbook, but all of them use over(partition by) method which is also not available in Hive.
Edward Capriolo
2012-05-11 20:05:54 UTC
Permalink
The main issue with group_concat is that aggregates have to keep each
column in memory and that is a big problem. If The user knows the list
will be small you could write a UDAF like collectset, collect which
puts each value into a list and then you can lateral view that list.

Edward
Post by Saurabh S
As far as I understand, there is no equivalent of MySQL group_concat() in Hive. This stackoverflow question is from Sept 2010: http://stackoverflow.com/questions/3703740/combine-multiple-rows-into-one-space-separated-string
Does anyone know any other method to create a delimited list from from table rows? There are a few methods listed in Anthony Molinaro's SQL Cookbook, but all of them use over(partition by) method which is also not available in Hive.
Edward Capriolo
2012-05-14 01:04:31 UTC
Permalink
I just through a derivative of collectset called collect but we need
some type of string join that takes an array and a delimiter.

https://github.com/edwardcapriolo/hive-collect
Post by Edward Capriolo
The main issue with group_concat is that aggregates have to keep each
column in memory and that is a big problem. If The user knows the list
will be small you could write a UDAF like collectset, collect which
puts each value into a list and then you can lateral view that list.
Edward
Post by Saurabh S
As far as I understand, there is no equivalent of MySQL group_concat() in Hive. This stackoverflow question is from Sept 2010: http://stackoverflow.com/questions/3703740/combine-multiple-rows-into-one-space-separated-string
Does anyone know any other method to create a delimited list from from table rows? There are a few methods listed in Anthony Molinaro's SQL Cookbook, but all of them use over(partition by) method which is also not available in Hive.
Edward Capriolo
2012-05-14 01:25:23 UTC
Permalink
Actually its collect or collectset + concat_ws udf.

Edward
Post by Edward Capriolo
I just through a derivative of collectset called collect but we need
some type of string join that takes an array and a delimiter.
https://github.com/edwardcapriolo/hive-collect
Post by Edward Capriolo
The main issue with group_concat is that aggregates have to keep each
column in memory and that is a big problem. If The user knows the list
will be small you could write a UDAF like collectset, collect which
puts each value into a list and then you can lateral view that list.
Edward
Post by Saurabh S
As far as I understand, there is no equivalent of MySQL group_concat() in Hive. This stackoverflow question is from Sept 2010: http://stackoverflow.com/questions/3703740/combine-multiple-rows-into-one-space-separated-string
Does anyone know any other method to create a delimited list from from table rows? There are a few methods listed in Anthony Molinaro's SQL Cookbook, but all of them use over(partition by) method which is also not available in Hive.
Loading...