Cautionary remarks on interpreting the survey results
There are a few cautionary remarks worth noting when reading the number of Regional Headquarters (RHQs), Regional Offices (ROs) and Local Offices (LOs) in the report of the “Annual Survey of Companies in Hong Kong with Parent Companies Located outside Hong Kong” (hereafter the Survey), published by the Census and Statistics Department (C&SD). Overlooking them may lead to incorrect statistics or comparisons.
What are the sampling methods in the Survey?
Drawing on different sources, C&SD constructs a sampling frame containing companies that have parent companies located outside Hong Kong. Every company in the sampling frame is asked to complete a questionnaire, in order to identify and count the number of RHQs/ROs/LOs. However, it is not compulsory for the surveyed companies to submit the questionnaire, which means some responses might be missing.
In the 2021 Survey report, the sampling sources included the samples counted in the previous year; consulates, trade commissions and chambers of commerce of overseas countries in Hong Kong; business directories, media reports and working contacts of Invest Hong Kong; up-to-date information from the Companies Registry; and other sources (e.g. relevant information available from C&SD).
After receiving the questionnaires, C&SD classifies each surveyed company as one of three types of office, namely RHQ, RO or LO, based on the information the company reports about its geographical responsibility and controlling power over offices or operations in the region.
The 2021 report stated that, by mid-September 2021, 9049 companies were successfully counted, while around 251 companies did not respond. As such, the non-response rate in 2021 was 2.26%, which is lower than the 4.24% in 2020.
To find out the precise number of RHQs/ROs/LOs in Hong Kong, what does the ideal survey look like?
Before examining the cautions in the Survey, we should understand what the ideal survey, one capable of getting the precise number of RHQs/ROs/LOs, looks like. We can then use this ideal survey as an anchor to measure how far the survey of C&SD is from it.
Regarding the sampling methods, in the ideal survey the entire underlying population (all companies which have parent companies located outside Hong Kong, except for those being excluded) is included in the survey (no coverage error). The sampling frame equals the population, and the number of samples received equals the number of intended samples, which means the response rate is 100% (no unit nonresponse error).
Moreover, the time at which companies fill in the questionnaire remains the same in every survey (no time period bias), in order to avoid the bias arising from comparing peak and normal periods for company establishment or closure.
Regarding the interviewers, they possess perfect information about the locations of the surveyed companies and their parent companies, the surveyed companies’ major line of business and their controlling power over other offices. After collecting the data, they accurately input all of it into a database to compile the statistics (no processing error).
Regarding the respondents (the staff who fill in the questionnaire), they possess perfect information about their companies’ major line of business and controlling power over other offices, answer every question in the survey (no item nonresponse error) honestly and accurately (no measurement error), and submit the questionnaire on time.
Regarding the survey questionnaires, the wording and content of the questionnaire are neither misleading nor ambiguous.
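The two conditions in this paragraph can be expressed as simple ratios. The sketch below is only an illustration: the function names and all figures are hypothetical and are not taken from the Survey.

```python
def coverage_rate(frame_size: int, population_size: int) -> float:
    """Share of the target population that appears in the sampling frame."""
    return frame_size / population_size


def unit_response_rate(returned: int, sent: int) -> float:
    """Share of the questionnaires sent out that were returned."""
    return returned / sent


# Hypothetical figures for illustration only (not taken from the Survey).
ideal = {"population": 10_000, "frame": 10_000, "returned": 10_000}
real_world = {"population": 10_000, "frame": 9_500, "returned": 9_000}

for label, s in (("ideal", ideal), ("real-world", real_world)):
    cov = coverage_rate(s["frame"], s["population"])
    resp = unit_response_rate(s["returned"], s["frame"])
    print(f"{label}: coverage {cov:.1%}, response {resp:.1%}")
```

In the ideal case both ratios equal 100%; any shortfall in the first points to coverage error and any shortfall in the second to unit nonresponse error.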
What cautionary remarks should we be aware of when interpreting the survey results?
However, in the real world, it is virtually impossible to run the ideal survey described above, because every aspect of a survey faces constraints.
Regarding the constraints in sampling methods, first of all, according to the Survey, an up-to-date, complete and accurate sampling frame covering the entire population is not available. The sampling frame therefore does not fully represent the underlying population, which might lead to coverage error.
In the report, the number of RHQs/ROs/LOs only represents the best estimate that could be taken at the time of the survey. Being unable to encompass the entire population in the sampling frame means that the number counted might be less than the actual total number of RHQs/ROs/LOs.
Secondly, the sampling sources used to construct the sampling frame are subject to adjustment in any year. Failure to add new samples to the sampling frame on a timely basis could result in mistaken estimates in different years.
For example, suppose a new sample became eligible for the sampling frame in 2020, but C&SD only added it in 2021. In this case, the increment and the total number of RHQs/ROs/LOs counted in 2020 were understated, while the increment in 2021 was overstated and could not truly reflect the changes over the past year.
Failure to obtain all samples from all sampling sources could lead to estimation errors as well. Taking consulates as an example of a sampling source, let’s assume there are 100 firms each from Japan and Mexico in Hong Kong. If the consulate of Japan is perfectly cooperative and submits a list with 100 samples to C&SD, whereas the consulate of Mexico is less cooperative and submits a list with only 50 samples, the number of Mexican firms will be understated and the ratio of Japanese firms to Mexican firms will be overstated.
In the Survey, the sampling sources have remained largely unchanged over the years, except that the “working contacts of Invest Hong Kong” have been included in the sampling frame only since 2002. It should be noted that Invest Hong Kong, established in July 2000, works with overseas and Mainland entrepreneurs, SMEs and multinationals that wish to set up an office or expand their existing business in Hong Kong.
Thirdly, regarding the constraints of respondents, the Survey is voluntary, so every year some companies did not submit the questionnaire, and the non-response rate varied over time. As far as I understand, while some of the non-responding companies are classified as RHQs/ROs/LOs based on the best knowledge of C&SD, others might be ignored.
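The effect of such a delay can be sketched with a toy series. All totals below are hypothetical; the only assumption carried over from the example is that one eligible office appears in 2020 but is not counted until 2021.

```python
# Hypothetical totals: one office becomes eligible in 2020
# but is only added to the sampling frame in 2021.
true_total = {2019: 1000, 2020: 1001, 2021: 1001}
counted_total = {2019: 1000, 2020: 1000, 2021: 1001}


def yearly_increments(totals):
    """Year-on-year change for each year after the first."""
    years = sorted(totals)
    return {y: totals[y] - totals[prev] for prev, y in zip(years, years[1:])}


true_inc = yearly_increments(true_total)
counted_inc = yearly_increments(counted_total)

for year in true_inc:
    print(f"{year}: true increment {true_inc[year]}, "
          f"counted increment {counted_inc[year]}")
```

The counted series shows no growth in 2020 and one extra office in 2021, the reverse of what actually happened, even though the 2021 total is correct.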
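The arithmetic of the consulate example can be checked in a few lines. The figures are the ones assumed in the example above; the variable names are only for illustration.

```python
# Figures from the hypothetical consulate example.
true_firms = {"Japan": 100, "Mexico": 100}
listed_firms = {"Japan": 100, "Mexico": 50}  # Mexico's list is incomplete

true_ratio = true_firms["Japan"] / true_firms["Mexico"]
observed_ratio = listed_firms["Japan"] / listed_firms["Mexico"]
mexico_shortfall = true_firms["Mexico"] - listed_firms["Mexico"]

print(f"True Japan-to-Mexico ratio:     {true_ratio:.1f}")
print(f"Observed Japan-to-Mexico ratio: {observed_ratio:.1f}")
print(f"Mexican firms missing from the frame: {mexico_shortfall}")
```

The true ratio is 1 to 1, but the frame suggests 2 to 1, so uneven cooperation across sources distorts comparisons between countries as well as the totals.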
If the companies that complete the survey are systematically different from those that choose not to submit it, the number of RHQs/ROs/LOs might suffer from non-response error. For example, if local offices generally tend not to respond because of insufficient manpower, the number of local offices will be understated unless the figure is adjusted.
If non-response is random, that is, the companies that choose not to respond do not differ systematically from those that do, the number of RHQs/ROs/LOs could be adjusted to mitigate the error by imputing the number of each type of office among the non-responding companies and adding it to the finalized statistics.
Fourthly, regarding the constraints of interviewers, C&SD does not have perfect information about the surveyed companies and therefore cannot easily verify whether the respondents have answered the survey questions correctly and honestly.
Fifthly, some of the non-responding companies are categorized as RHQs/ROs/LOs based on C&SD’s understanding and best knowledge. Part of the classification process therefore rests on individual judgement, which could be subjective or wrong without perfect information about the surveyed companies’ major line of business and their controlling power over other offices in the region.
Sixthly, in regard to the timing of filling in the questionnaire, although the reference period is set at the beginning of June, the questionnaires were sent to the surveyed companies in late May to early June every year and were collected from late August to early October, leaving the companies 3 to 4 months to fill them in. Some companies may have filled in the form in June and others later, so the information may not be measured at the same point in time, which could weaken the comparability of the number of RHQs/ROs/LOs across years.
All in all, because of potential problems such as an incomplete and irregularly adjusted sampling frame, non-response error, and misclassification without up-to-date information, the number of RHQs/ROs/LOs might be inaccurate and not comparable across years.
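A small numerical sketch shows how systematic non-response skews the counts. The true counts and response rates below are invented for illustration and are not estimates from the Survey; the only assumption taken from the text is that local offices respond less often.

```python
# Hypothetical true counts and response rates (in percent) by office type.
true_counts = {"RHQ": 200, "RO": 300, "LO": 500}
response_pct = {"RHQ": 95, "RO": 95, "LO": 80}  # LOs respond less often

# Expected number of responding companies of each type.
counted = {k: true_counts[k] * response_pct[k] // 100 for k in true_counts}

lo_share_true = true_counts["LO"] / sum(true_counts.values())
lo_share_counted = counted["LO"] / sum(counted.values())

print("Counted:", counted)
print(f"LO share, true:    {lo_share_true:.1%}")
print(f"LO share, counted: {lo_share_counted:.1%}")
```

Without adjustment, both the number of local offices and their share of all offices are understated, while the counts of the other two types are much closer to the truth.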
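One simple way to carry out such an adjustment is proportional imputation: non-responding companies are allocated to office types in the same proportions as the responding ones. This is a minimal sketch of that general idea under the random non-response assumption, not a description of C&SD’s actual procedure, and all figures are hypothetical.

```python
def impute_proportionally(respondent_counts, n_nonrespondents):
    """Allocate non-respondents to office types in the same proportions
    as respondents, then add them to the respondent counts."""
    total = sum(respondent_counts.values())
    return {
        office: count + n_nonrespondents * count / total
        for office, count in respondent_counts.items()
    }


# Hypothetical figures for illustration only.
respondents = {"RHQ": 250, "RO": 250, "LO": 500}
adjusted = impute_proportionally(respondents, n_nonrespondents=100)

for office, value in adjusted.items():
    print(f"{office}: counted {respondents[office]}, adjusted {value:.0f}")
```

The adjustment is only as good as the randomness assumption: if one office type is less likely to respond than the others, proportional imputation will still understate it.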
The table below summarizes the potential errors in the statistics of Regional Headquarters, Regional Offices and Local Offices compiled by C&SD:
|Constraints||Ideal survey||Real-world survey of C&SD||Potential errors|
|Sampling frame||Entire population||Part of population||Coverage error|
|Sampling frame||Updates changes immediately||Might update changes with a delay||Coverage error|
|Interviewers||Possess complete information on responding companies||Possess incomplete information on responding companies||Measurement error|
|Interviewers||Precisely process and record all information||Information processing and recording might be mistaken||Processing error|
|Respondents||Possess complete information on their own companies||Possess incomplete information on their own companies||Measurement error|
|Respondents||Precisely and honestly answer all questions||Answers might be mistaken due to faulty memory or dishonesty||Measurement error|
|Timing of filling in questionnaires||Fixed timing every year||Any time from around June to October||Time period bias|
|Response rate of questionnaires||100%||Lower than 100%||Non-response error|