|
我想出了一些可行的方法,但并非完全符合我的期望。这是我的解决方案:
% |9 G. g# ?! o7 R. j2 E. V; B: G#standardSQL( N2 `& u" k# `: r
UPDATE
% k( W2 ]% e' u% h4 y `attribution.daily_sessions_20180301_copy1` AS target
7 h; b4 Z8 j8 M$ X+ d+ `5 uSET
) [% t7 B" I; h( s0 t3 dhits = ARRAY(
! s! S9 E _. @- P& Y8 v SELECT AS STRUCT * REPLACE(ARRAY(
: n( f) T% n0 P) e& x) R* `: ^ SELECT AS STRUCT *
, _8 l" k; l& w5 n. ~ FROM(, o4 E1 T( a$ b/ s5 O( k
SELECT AS STRUCT * REPLACE(map.category AS productCategoryAttribute) FROM UNNEST(product))) AS product) FROM UNNEST(hits)
' e5 t' t* t$ ?/ |+ l0 D9 U1 W)6 B- ~) e+ C! J0 a' T" B M1 Z
FROM
0 V2 A& g6 X7 I4 \1 I% m1 L `attribution.attribute_category_map`
# ?& w0 y/ l! v* hAS map
3 n$ L, b: x# u2 `WHERE, a2 l" L3 D9 P! f# q& }
(
7 n* U$ u# W' F: X SELECT REPLACE(LOWER(prod.productCategory), 'amp;', '') FROM UNNEST(target.hits) AS h,1 J: F& I( p# P1 ]+ T
UNNEST(h.product) AS prod LIMIT 1) = map.raw_name
$ P" k2 w- j) \; @+ Tattribute_category_map是一个具有两列的表,我在其中查找第1列中的对应值,并用第2列中的值替换目标表中的数据。我实现的最佳结果-5 J! W4 Q3 q, m! N
用相同的值更新了一行中的所有嵌套字段,这仅适用于第一个嵌套字段,而不是使用特定值更新每个嵌套字段。! T" q- Z* O3 ?* N! [8 w; K2 c
主表的简化架构:
1 C/ L% v# `: U( i% r7 r[
0 I- Q0 h" Z& n; M# \: j! K { 4 h! t! v) k' E9 n: ?
"name":"sessionId",
% u1 \- A, t6 K/ S6 _! N) A, x "type":"STRING",- J/ Z) P7 j" n) P4 `
"mode":"NULLABLE"
6 C8 L' k/ A# Z% a1 U },
8 r4 b( E3 f* P" q" Y2 ] { . t% b! p1 m0 ?- [
"name":"hits",
: d* T! R- [' E0 d# D; }) R "type":"RECORD",* @( t6 N: ^4 y1 X
"mode":"REPEATED",
" Z" @5 R' ~5 F1 N7 e: | "fields":[ / ?* H/ R3 Q5 Y0 Y1 I2 O# X
{ 4 v5 h1 |/ f: m
"name":"product"," h9 m& [5 Y6 f2 {2 `" M
"type":"RECORD",- f& V' G: v: g/ G) {7 K
"mode":"REPEATED",
7 h- S" z! q) Y8 c "fields":[ : t" Q! V* S1 {1 f/ q0 Q$ e
{
& D' r5 u. O1 ]6 d; l) }5 g" x "name":"productCategory",
6 L: F" f) Y1 u0 U% o "type":"STRING",5 m1 z1 j& b; n! l$ Y0 M1 M3 B9 `
"mode":"NULLABLE", o! Z8 X% ]4 R6 J& w" `
},
, j" ^' B" v! A9 M { ! u K2 e$ d2 U* N2 k, r1 s- o
"name":"productCategoryAttribute",
6 j/ G. r$ r. H% D& ~- [) h "type":"STRING",
; W }% g! z! g4 }, M" ~ "mode":"NULLABLE"
" n. D; b9 r" {7 K, y/ l }
, D: y* i3 A( C; H8 z$ Q- q4 c ]/ a7 X7 E( \* @; I
}
: G: |) W: r; f% n' K1 B( t ]% D9 K6 z' ^, C$ R& Z" \4 J$ Z
}6 A; y% n0 D+ J3 n# H* x
]
* {* d; H- s- A( S# i4 i9 i& F会话行中通常有多个匹配项,一个匹配项中通常包含多个产品。看起来像那些值(如果您嵌套):, ^) G# ], C9 I# p$ ?
-----------------------------------------------------------------------------
' U u1 Y/ ?. f0 I6 U' m" E( RsessionId | hits.product.productCategory| hit.product.productCategoryAttribute5 T: c4 a; g3 ^; H$ E, f
-----------------------------------------------------------------------------
- B8 J$ X' s+ i; e& Z1 | automotive chemicals | null
: D0 L) [- y1 w5 u U+ H1 | automotive tools | null
1 K: C! i" m0 P- V( s3 J S1 | null | null9 ?( }; N2 N. j. H
2 | null | null
, m. d" e/ O1 j# N) E; t, v2 | automotive chemicals | null& r* B0 s1 c, }2 o" ^
2 | null | null2 h# H( |' f. p+ _$ b1 R
3 | null | null0 p& N5 i4 T6 a! K2 d
3 | bed accessories | null7 d! L# ^( x3 j6 Z$ B
4 | null | null
, z' ]3 E3 c0 x4 | null | null c5 S$ U5 ?% L/ y, d% z
4 | automotive chemicals | null. f0 I7 F% s, `: R) w7 \
4 | null | null. \: |' c% Y- L; ~! [8 `
-----------------------------------------------------------------------------8 l, d; ?$ b) Z) j6 P
映射表的架构:4 e- X6 q6 F! r9 o$ m' Z$ A6 A/ G
[ ) X+ R( V2 @; T: Z
{
; G: J' [8 P& H! @/ O; q, R "name":"raw_name", w7 ^, F/ O6 V9 p8 U
"type":"STRING",
, C6 [8 i& ]- D c Z$ g6 b "mode":"NULLABLE"
: ?% ^- c* @) }5 _! ], P },
5 y8 d* C9 [2 B# v { 1 v. ?; W' z5 J9 s j! R
"name":"category",
) Q$ [2 o' _ _' O! x9 E1 A' p "type":"STRING",
k; }9 h4 L% e& S "mode":"NULLABLE"5 r+ N# L8 ?- ~ _: y' K R; v% ~
}( Q [8 Z( N6 }1 A1 f( b% G
]1 O3 n) g$ W+ K" y& I$ M
具有这样的值:! J* n. e; {: S: @3 c& M- F! K
---------------------------------------------------0 P9 Z. y" E3 p' Y% ]
raw_name |category |0 D8 T. N0 a: | q" a
---------------------------------------------------# m5 e: @) T) V! x
automotive chemicals |d1y2 - automotive chemicals|
# R0 e* d* J( |6 Y" S8 Oautomotive paint |dijf1 - automotive paint |
3 N; M' s$ P+ x: W8 R9 V; ?automotive tools |efw1 - automotive tools |
! p N* }2 r/ s0 ?+ `. |" Qbaby & infant toys |wwfw - baby & infant toys |
7 x5 ?' N: t1 g/ k5 Xbatteries & power |fdsv- batteries & power |
7 }( b1 z( t- r6 S) v9 k" pbed accessories |0k77 - bed accessories |
9 ^0 y' ~; h& Bbike racks |12df - bike racks |
: T% z( c# j0 _) J. h--------------------------------------------------
6 H. q" `5 {/ D9 q' o结果是我想要什么:! E2 d: T" |) g( U) ?) v* p
-----------------------------------------------------------------------------* M% B8 ]1 B& ~8 f9 F
sessionId | hits.product.productCategory| hit.product.productCategoryAttribute
5 K; `. U6 `0 ]& V0 j; P3 e1 {-----------------------------------------------------------------------------
6 x" F/ F3 [: ]6 ?$ A0 W9 w2 w& o 1 | automotive chemicals | d1y2 - automotive chemicals: k/ w$ e2 L3 c1 F1 e2 n' l
1 | automotive tools | efw1 - automotive tools
7 r: Z3 g' t* f0 w 1 | null | null. I. ^1 d% d% v* n. I
2 | null | null3 p. O# t4 a" b) ^' [+ p
2 | automotive chemicals | d1y2 - automotive chemicals% ^- s+ u, e6 J) _
2 | null | null
: u+ I+ ]. B/ T k 3 | null | null
8 s1 J* v. h2 h9 D1 d6 M) E) ~) l1 ~ 3 | bed accessories | 0k77 - bed accessories
8 k) v0 o7 ]+ u! ?- d. U 4 | null | null; }+ K) H3 n p! b, L6 J! _; m/ d
4 | null | null/ S* M& ], X. @* P8 W x( P' @
4 | automotive chemicals | d1y2 - automotive chemicals: q% t) m& F* |& ]. W
4 | null | null/ L. `$ Y+ ?& Q; G$ [+ t1 z) ?
-----------------------------------------------------------------------------
6 @3 P. s a% a; `4 G1 N( t我需要从主表中获取值productCategory,在map表中的raw_name列中查找它,从colum类别中获取值,并将其放入主表的productCategoryAttribute列中。主要问题是目标字段是双重嵌套的,我不知道如何直接加入它们。
& u4 X7 ` `3 E. _. o+ r & N+ V1 R R' H1 {
解决方案:. ~1 ?1 Q3 R* p- e( A5 |$ I, ]5 x
% a6 F: `9 Y5 | U6 ]
$ A: k: R& b0 i- c. A$ L9 z+ {: h+ B
: H2 \5 s0 j6 O2 m! v# T 下面经过测试!0 U* g$ I/ s6 V: g5 B6 a# V
保留整个表的架构/数据不变,并且仅根据各自的映射更新productCategoryAttribute的值
3 E4 i8 h5 t# R1 |. {#standardSQL9 n4 H p) d" \: W
UPDATE `project.dataset.your_table` t
8 u6 d7 [* \) H: C& PSET hits =
8 [0 G; C+ O+ N+ M; L ARRAY(6 s* ?0 J; f2 Q/ X! \1 S8 L N
SELECT AS STRUCT * REPLACE(: K2 T6 v8 m9 x" @5 j1 I9 k
ARRAY(+ G& m$ S5 p' w! N$ b, U7 s) A4 q
SELECT AS STRUCT product.* REPLACE(
; b! z0 c4 e* T, V' v CASE WHEN map.raw_name = product.productCategory THEN category L$ \5 y+ s- i6 Z. {
ELSE productCategoryAttribute END AS productCategoryAttribute)
1 K) `4 z4 `/ p$ z) O. o FROM UNNEST(product) product9 a3 w- ]6 J$ A# f& G
LEFT JOIN UNNEST(agg_map.map) map
: ^& A0 N( Y- }; b9 d ON map.raw_name = product.productCategory
6 H7 k, R& h/ F8 s% p& I9 u ) AS product)
- T. t# }4 |3 U* Q FROM UNNEST(hits) hit
5 B$ G) x/ Q% O; u' I5 q )
' j8 I9 @3 j) Z. s% ]FROM (SELECT ARRAY_AGG(row) map FROM `project.dataset.map` row) agg_map
" C. J2 l6 c- NWHERE TRUE
8 S9 M2 B: Z! k. |' S注意:以上解决方案假定映射表不是那么大,因为它依赖于将整个映射表聚合到一个数组中 |
|