Apache HDFS
A SELECT statement can consist of the following basic clauses.
SELECT
INTO
FROM
JOIN
WHERE
GROUP BY
HAVING
UNION
ORDER BY
LIMIT
The following syntax diagram outlines the syntax supported by the SQL engine of the provider:
Return all columns:
Rename a column:
Cast a column's data as a different data type:
Search data:
Return the number of items matching the query criteria:
Return the number of unique items matching the query criteria:
Return the unique items matching the query criteria:
Summarize data:
See Aggregate Functions below for details.
Retrieve data from multiple tables.
See JOIN Queries below for details.
Sort a result set in ascending order:
Restrict a result set to the specified number of rows:
Parameterize a query to pass in inputs at execution time. This enables you to create prepared statements and mitigate SQL injection attacks.
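As a rough illustration of why parameterized queries help, the sketch below uses Python's standard-library sqlite3 module as a stand-in for the provider. The in-memory Files table, its sample rows, and the find_file helper are hypothetical; SQLite's ? placeholder plays the role of the ? and @param markers in the syntax above, and the provider's own client API may differ.

```python
import sqlite3

# In-memory SQLite database standing in for the provider (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Files (FileId TEXT, ChildrenNum INTEGER, Length INTEGER)")
conn.executemany(
    "INSERT INTO Files VALUES (?, ?, ?)",
    [("119116", 0, 2048), ("119117", 3, 0)],
)


def find_file(file_id):
    """Hypothetical helper: look up a file by id with a bound parameter."""
    # The value is bound to the ? placeholder and never spliced into the SQL text.
    cur = conn.execute(
        "SELECT FileId, Length FROM Files WHERE FileId = ?", (file_id,)
    )
    return cur.fetchall()


rows = find_file("119116")
# An injection-style payload is compared as an ordinary string and matches nothing.
injected = find_file("119116' OR '1'='1")
```

Because the input travels separately from the statement text, the same statement can be reused with different values without re-escaping anything.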
COUNT: Returns the number of rows matching the query criteria.
COUNT(DISTINCT): Returns the number of distinct, non-null field values matching the query criteria.
AVG: Returns the average of the column values.
MIN: Returns the minimum column value.
MAX: Returns the maximum column value.
SUM: Returns the total sum of the column values.
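The behavior of these aggregates can be tried out locally. The following sketch runs them in SQLite through Python's sqlite3 module; the Files table and its four sample rows are made up for illustration, and SQLite is only an approximation of the provider's engine.

```python
import sqlite3

# In-memory SQLite database standing in for the provider (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Files (FileId TEXT, ChildrenNum INTEGER, Length INTEGER)")
conn.executemany(
    "INSERT INTO Files VALUES (?, ?, ?)",
    [("1", 0, 100), ("2", 0, 300), ("3", 2, 50), ("4", None, 150)],
)

# All six aggregates over the whole table in one pass.
row_count, distinct_children, avg_len, min_len, max_len, total_len = conn.execute(
    """
    SELECT COUNT(*), COUNT(DISTINCT ChildrenNum), AVG(Length),
           MIN(Length), MAX(Length), SUM(Length)
    FROM Files
    """
).fetchone()

# One aggregate per group, as in the GROUP BY examples.
per_group = conn.execute(
    "SELECT ChildrenNum, MAX(Length) FROM Files "
    "WHERE ChildrenNum IS NOT NULL "
    "GROUP BY ChildrenNum ORDER BY ChildrenNum"
).fetchall()
```

Note that COUNT(*) counts all four rows, while COUNT(DISTINCT ChildrenNum) skips the NULL value and counts only the two distinct non-null values.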
The Provider for HDFS supports standard SQL joins like the following examples.
An inner join selects only rows from both tables that match the join condition:
A left join selects all rows in the FROM table and only matching rows in the JOIN table:
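To make the difference between the two join types concrete, here is a small sketch in SQLite via Python's sqlite3 module. The two tables and their rows are invented: /b exists in Files but has no matching row in Permissions.

```python
import sqlite3

# In-memory SQLite database standing in for the provider (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Files (FullPath TEXT, Owner TEXT)")
conn.execute("CREATE TABLE Permissions (FullPath TEXT, OwnerRead INTEGER)")
conn.executemany("INSERT INTO Files VALUES (?, ?)", [("/a", "hdfs"), ("/b", "alice")])
conn.executemany("INSERT INTO Permissions VALUES (?, ?)", [("/a", 1)])

# INNER JOIN: only rows that satisfy the join condition in both tables.
inner = conn.execute(
    "SELECT c.FullPath, o.OwnerRead FROM Files c "
    "INNER JOIN Permissions o ON c.FullPath = o.FullPath "
    "ORDER BY c.FullPath"
).fetchall()

# LEFT JOIN: every row of the FROM table; unmatched columns come back as NULL.
left = conn.execute(
    "SELECT c.FullPath, o.OwnerRead FROM Files c "
    "LEFT JOIN Permissions o ON c.FullPath = o.FullPath "
    "ORDER BY c.FullPath"
).fetchall()
```

The inner join drops /b, whereas the left join keeps it with None (NULL) in the OwnerRead column.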
The following date literal functions can be used to filter date fields using relative intervals. Note that while the <, >, and = operators are supported for these functions, <= and >= are not.
L_TODAY(): The current day.
L_YESTERDAY(): The previous day.
L_TOMORROW(): The following day.
L_LAST_WEEK(): Every day in the preceding week.
L_THIS_WEEK(): Every day in the current week.
L_NEXT_WEEK(): Every day in the following week.
Also available:
L_LAST_MONTH(), L_THIS_MONTH(), L_NEXT_MONTH()
L_LAST_QUARTER(), L_THIS_QUARTER(), L_NEXT_QUARTER()
L_LAST_YEAR(), L_THIS_YEAR(), L_NEXT_YEAR()
L_LAST_N_DAYS(n): The previous n days, excluding the current day.
L_NEXT_N_DAYS(n): The following n days, including the current day.
Also available:
L_LAST_90_DAYS(), L_NEXT_90_DAYS()
L_LAST_N_WEEKS(n): Every day in every week, starting n weeks before the current week and ending in the previous week.
L_NEXT_N_WEEKS(n): Every day in every week, starting the following week and ending n weeks in the future.
Also available:
L_LAST_N_MONTHS(n), L_NEXT_N_MONTHS(n)
L_LAST_N_QUARTERS(n), L_NEXT_N_QUARTERS(n)
L_LAST_N_YEARS(n), L_NEXT_N_YEARS(n)
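The "previous n days, excluding the current day" and "following n days, including the current day" intervals can be spelled out as plain date arithmetic. The Python sketch below is one reading of those descriptions, not the provider's implementation; the helper names last_n_days, next_n_days, and in_range are hypothetical, and the week-based functions are left out because the document does not state which day a week starts on.

```python
from datetime import date, timedelta


def last_n_days(today, n):
    """Previous n days, excluding today, as an inclusive (start, end) pair."""
    return today - timedelta(days=n), today - timedelta(days=1)


def next_n_days(today, n):
    """Following n days, including today, as an inclusive (start, end) pair."""
    return today, today + timedelta(days=n - 1)


def in_range(value, bounds):
    """True if value falls inside the inclusive (start, end) bounds."""
    start, end = bounds
    return start <= value <= end


today = date(2024, 3, 15)  # fixed sample date so the result is reproducible
previous = last_n_days(today, 3)
upcoming = next_n_days(today, 3)
```

With the sample date of 15 March 2024 and n = 3, the previous interval covers 12 to 14 March and the upcoming interval covers 15 to 17 March.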
SELECT {
  [ TOP <numeric_literal> | DISTINCT ]
  {
    *
    | {
        <expression> [ [ AS ] <column_reference> ]
        | { <table_name> | <correlation_name> } .*
      } [ , ... ]
  }
  [ INTO csv:// [ filename= ] <file_path> [ ;delimiter=tab ] ]
  {
    FROM <table_reference> [ [ AS ] <identifier> ]
  } [ , ... ]
  [ [
      INNER | { { LEFT | RIGHT | FULL } [ OUTER ] }
    ] JOIN <table_reference> [ ON <search_condition> ] [ [ AS ] <identifier> ]
  ] [ ... ]
  [ WHERE <search_condition> ]
  [ GROUP BY <column_reference> [ , ... ] ]
  [ HAVING <search_condition> ]
  [ UNION [ ALL ] <select_statement> ]
  [
    ORDER BY
    <column_reference> [ ASC | DESC ] [ NULLS FIRST | NULLS LAST ]
  ]
  [
    LIMIT <expression>
    [
      { OFFSET | , }
      <expression>
    ]
  ]
}

<expression> ::=
  | <column_reference>
  | @ <parameter>
  | ?
  | COUNT( * | { [ DISTINCT ] <expression> } )
  | { AVG | MAX | MIN | SUM | COUNT } ( <expression> )
  | NULLIF ( <expression> , <expression> )
  | COALESCE ( <expression> , ... )
  | CASE <expression>
      WHEN { <expression> | <search_condition> } THEN { <expression> | NULL } [ ... ]
    [ ELSE { <expression> | NULL } ]
    END
  | <literal>
  | <sql_function>

<search_condition> ::=
  {
    <expression> { = | > | < | >= | <= | <> | != | LIKE | NOT LIKE | IN | NOT IN | IS NULL | IS NOT NULL | AND | OR | CONTAINS | BETWEEN } [ <expression> ]
  } [ { AND | OR } ... ]
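A few of the expression forms in the grammar (COALESCE, NULLIF, CASE) and the LIMIT ... OFFSET clause are easiest to understand by running them. The sketch below does so in SQLite through Python's sqlite3 module; the Files table and its rows are invented, and SQLite's handling of these constructs is assumed, not confirmed, to match the provider's.

```python
import sqlite3

# In-memory SQLite database standing in for the provider (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Files (FileId TEXT, Owner TEXT, Length INTEGER)")
conn.executemany(
    "INSERT INTO Files VALUES (?, ?, ?)",
    [("1", "hdfs", 0), ("2", None, 4096), ("3", "alice", 10)],
)

# COALESCE replaces NULL owners, NULLIF turns zero lengths into NULL,
# and the simple CASE form labels each row by its Length value.
rows = conn.execute(
    """
    SELECT FileId,
           COALESCE(Owner, 'unknown') AS OwnerName,
           NULLIF(Length, 0) AS NonZeroLength,
           CASE Length WHEN 0 THEN 'empty' ELSE 'non-empty' END AS SizeClass
    FROM Files
    ORDER BY FileId
    LIMIT 2
    """
).fetchall()

# LIMIT with OFFSET: skip the first two rows, then return one.
third = conn.execute(
    "SELECT FileId FROM Files ORDER BY FileId LIMIT 1 OFFSET 2"
).fetchall()
```

Combining ORDER BY with LIMIT and OFFSET, as in the second query, is the usual way to page through a result set deterministically.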
SELECT * FROM Files
SELECT [ChildrenNum] AS MY_ChildrenNum FROM Files
SELECT CAST(Length AS VARCHAR) AS Str_Length FROM Files
SELECT * FROM Files WHERE FileId = '119116'
SELECT COUNT(*) AS MyCount FROM Files
SELECT COUNT(DISTINCT ChildrenNum) FROM Files
SELECT DISTINCT ChildrenNum FROM Files
SELECT ChildrenNum, MAX(Length) FROM Files GROUP BY ChildrenNum
SELECT c.Owner, o.OwnerRead, o.OwnerWrite, o.OwnerExecute FROM Files c INNER JOIN Permissions o ON c.FullPath = o.FullPath
SELECT FileId, ChildrenNum FROM Files ORDER BY ChildrenNum ASC
SELECT FileId, ChildrenNum FROM Files LIMIT 10
SELECT * FROM Files WHERE FileId = @param
SELECT COUNT(*) FROM Files WHERE FileId = '119116'
SELECT COUNT(DISTINCT FileId) AS DistinctValues FROM Files WHERE FileId = '119116'
SELECT ChildrenNum, AVG(Length) FROM Files WHERE FileId = '119116' GROUP BY ChildrenNum
SELECT MIN(Length), ChildrenNum FROM Files WHERE FileId = '119116' GROUP BY ChildrenNum
SELECT ChildrenNum, MAX(Length) FROM Files WHERE FileId = '119116' GROUP BY ChildrenNum
SELECT SUM(Length) FROM Files WHERE FileId = '119116'
SELECT c.Owner, o.OwnerRead, o.OwnerWrite, o.OwnerExecute FROM Files c INNER JOIN Permissions o ON c.FullPath = o.FullPath
SELECT c.[Group], o.GroupRead, o.GroupWrite, o.GroupExecute FROM Files c LEFT JOIN Permissions o ON c.FullPath = o.FullPath
SELECT * FROM MyTable WHERE MyDateField = L_TODAY()
SELECT * FROM MyTable WHERE MyDateField = L_YESTERDAY()
SELECT * FROM MyTable WHERE MyDateField = L_TOMORROW()
SELECT * FROM MyTable WHERE MyDateField = L_LAST_WEEK()
SELECT * FROM MyTable WHERE MyDateField = L_THIS_WEEK()
SELECT * FROM MyTable WHERE MyDateField = L_NEXT_WEEK()
SELECT * FROM MyTable WHERE MyDateField = L_LAST_N_DAYS(3)
SELECT * FROM MyTable WHERE MyDateField = L_NEXT_N_DAYS(3)
SELECT * FROM MyTable WHERE MyDateField = L_LAST_N_WEEKS(3)
SELECT * FROM MyTable WHERE MyDateField = L_NEXT_N_WEEKS(3)