Phonetic realisations of Madurese vowels and their implications for the Madurese vowel system

It has been suggested that Madurese has eight surface vowels [a, ɛ, ə, ɔ, ɤ, i, ɨ, u], but there is disagreement about the number of its vowel phonemes. The disagreement arises partly because some scholars base their analyses of Madurese vowels on phonetic grounds while others base them on particular phonological analyses. In addition, some researchers do not distinguish native from non-native Madurese words in their analyses. This paper addresses these problems by combining phonetic and phonological analyses in order to provide a better description of Madurese vowels. To this end, we investigated the acoustic realisations of the eight surface vowels by examining the first and second formant frequencies (F1 and F2) of the high and non-high vowel pairs (i ~ ɛ, ɨ ~ ə, ɤ ~ a, u ~ ɔ). Fifteen speakers of Madurese were recorded reading Madurese words embedded in a carrier phrase. All segmentation was done in Praat, and F1 and F2 values were extracted using a Praat script. The data were analysed with a linear mixed-effects model accounting for variation due to both random and fixed factors. The results showed that all high and non-high vowel pairs differed significantly in their F1 values. The results for F2 were more variable: at vowel onset, only the pair [ɨ ~ ə] differed significantly, whereas at vowel midpoint both [i ~ ɛ] and [ɨ ~ ə] differed significantly. We also compared the vowels [ɤ] and [ɨ] as well as [ɤ] and [ə] to see whether they differed in their F1 and F2 values. Our results confirmed that they were significantly different at both vowel onset and midpoint. The results are discussed with reference to phonological analysis and vowel dispersion theory. The analyses suggest that Madurese is best described as a language with a four-vowel system, which offers a solution to the disagreements over the number of vowel phonemes in Madurese.


INTRODUCTION
Most previous work agrees that Madurese has eight surface vowel qualities, but researchers differ as to the number of vowel phonemes it has. Such differences may partly arise because some researchers base their distinction of Madurese vowels purely on sounds as found in lexical items while some others probably base the vowel distinction on a particular phonological analysis of the language. The disagreements also result from the fact that some researchers do not distinguish between native vowels of Madurese and non-native ones that are found in some loanwords.
As shown in Table 1, Madurese vowels can be

Table 1 Madurese Surface Vowels (Stevens, 1980; Cohn & Lockwood, 1994)

A quite different view of vowel phonemes and their alternations in Madurese is proposed by Anderson (1991). She claims that the 'default' vowels in this language consist of three non-high vowels /ɛ, a, ɔ/, which surface as [ɛ, a, ɔ] and [i, ʌ, u], and that there is no distinction between ə and ɨ. Following Kiliaan (1897), Anderson argues that the vowel /ə/ does not alternate and hence can occur after voiced and voiceless aspirated stops. It is also important to note that Anderson uses the IPA symbol [ʌ] instead of [ɤ]. In contrast, Davies (2010, pp. 36-37) argues that Madurese has six phonemic vowels, namely /ɛ/, /ɔ/, /a/, /ə/, /i/, /u/. Unlike Stevens, Davies includes /i/ and /u/ in the Madurese vowel inventory, arguing that they are also found in word-initial position. He shows that these two vowels are found particularly in Madurese loanwords such as [imigrasi] 'immigration' and [uɟiɤn] 'exam'.
The differences discussed above suggest that the phonetic and phonological status of Madurese vowels requires further research. Apart from this, previous instrumental studies on Madurese (e.g. Cohn, 1993a; Cohn, 1993b; Cohn & Lockwood, 1994) involved only one or two speakers. We address this limitation by recruiting a relatively large number of speakers so that the data are more representative for the purposes of statistical analysis.
Thus, the goal of the study was to investigate the acoustic realisations of the eight surface vowels by looking at the first and second formant frequencies (F1 and F2) of the high and non-high vowel pairs, i.e. [i ~ ɛ, ɨ ~ ə, ɤ ~ a, u ~ ɔ] (Table 2 provides some examples of the vowel alternations in Madurese words). The results of these analyses can provide a more definitive description of each pair of high and non-high vowels, i.e. what they look like in the vowel space. Furthermore, they will answer the question of whether the three central vowels [ɨ, ə, ɤ], which impressionistically sound relatively similar, can in fact be distinguished by their F1 and F2 values. This is important since scholars disagree about the phonetic and phonological status of these Madurese vowels in particular (Anderson, 1991; Cohn, 1993a; Davies, 2010; Stevens, 1968). More importantly, the results can provide a solid basis for deciding how the vowel system of Madurese is best described in light of the acoustic data.

METHOD
Fifteen native speakers of Madurese (eight males, seven females) participated in this study. They came from Bangkalan, Sampang, Pamekasan and Sumenep. Besides Madurese, they also spoke Indonesian and had learnt English at school or university. However, they predominantly used Madurese in their daily lives. The participants were recorded in a quiet room using a Marantz PMD661 audio recorder with a Shure SM10A microphone. They were instructed to read 188 disyllabic words embedded in a carrier phrase. They were asked to read them in three random repetitions as fluently and naturally as possible with declarative intonation. The recordings were then segmented and coded manually in Praat Version 6.0.54 (Boersma & Weenink, 2019), focusing on the first two formant frequencies, i.e. the first formant (F1) and the second formant (F2). F1 and F2 values were extracted using a Praat script, which was modified where necessary to fit the purpose. The F1 and F2 measurements were analysed with linear mixed-effects models, using the lme4 package (Bates et al., 2015) for R (R Core Team, 2015). To obtain p-values and perform post-hoc tests, the lsmeans package (Lenth, 2016) was employed. A fixed effect was regarded as significant at α = 0.05.

FINDINGS AND DISCUSSION
This section presents the findings and discussion and is structured as follows. Firstly, we look at the descriptive statistics for the overall F1 and F2 measurements of the vowels under study. Secondly, using a linear mixed-effects model, we test whether there are significant differences in F1 and F2 values between the high and non-high vowels, as well as between the three central vowels, at both vowel onset and vowel midpoint. Thirdly, we discuss the implications of the findings for the Madurese vowel system. That is, based on the acoustic evidence and guided by its phonology, we propose how the Madurese vowel system is better described. In addition, we discuss the system in the light of vowel dispersion theory.
Figure 1 shows the acoustic space of the eight surface vowels of Madurese and illustrates in particular the differences between the pairs of high and non-high vowels (i ~ ɛ, ɨ ~ ə, ɤ ~ a, u ~ ɔ), pooled across speakers, places of articulation and repetitions. The data from female and male speakers are plotted separately, and F1 and F2 were sampled over the course of the vowels. The vertical axis represents the first formant frequency and the horizontal axis the second formant frequency. All values have been normalised using z-transformation. The ellipses indicate one standard deviation away from the mean and each ellipse contains approximately 68.27% of the data points.
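The z-transformation mentioned above can be illustrated with a minimal sketch. It assumes that normalisation is done per speaker and per formant (the usual Lobanov-style procedure); the text does not spell this out, so this is an assumption. The function name `z_normalise`, the dictionary keys and the toy values are hypothetical and are not taken from the study's scripts or data.

```python
from statistics import mean, stdev


def z_normalise(tokens):
    """Add per-speaker z-scores for F1 and F2 to each token.

    tokens: list of dicts with keys "speaker", "vowel", "f1", "f2" (Hz).
    Assumption: each speaker has at least two tokens whose values differ,
    otherwise the standard deviation is undefined or zero.
    """
    # Collect each speaker's raw formant values.
    values = {}
    for t in tokens:
        per_speaker = values.setdefault(t["speaker"], {"f1": [], "f2": []})
        per_speaker["f1"].append(t["f1"])
        per_speaker["f2"].append(t["f2"])

    # Mean and sample standard deviation per speaker and formant.
    stats = {
        spk: {k: (mean(v), stdev(v)) for k, v in formants.items()}
        for spk, formants in values.items()
    }

    # z = (x - speaker mean) / speaker SD, keeping the original token order.
    normalised = []
    for t in tokens:
        row = dict(t)
        for k in ("f1", "f2"):
            m, s = stats[t["speaker"]][k]
            row[k + "_z"] = (t[k] - m) / s
        normalised.append(row)
    return normalised


# Hypothetical values for illustration only, not measurements from the study.
toy = [
    {"speaker": "S1", "vowel": "i", "f1": 300.0, "f2": 2200.0},
    {"speaker": "S1", "vowel": "ə", "f1": 500.0, "f2": 1500.0},
    {"speaker": "S1", "vowel": "a", "f1": 700.0, "f2": 1300.0},
]
result = z_normalise(toy)
```

Normalising within speakers in this way removes much of the difference in absolute formant values between speakers, which is what allows tokens from several speakers to be plotted in a single F1 x F2 space.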

Figure 1 Distribution of Vowels Averaged over the Vowel Timecourse in a Z-Normalised F1 X F2 Space with Data from Female on the Left Panel and Male on the Right Panel (the arrows indicate the pair of non-high and high vowels.)
As shown in Figure 1 for both genders. Furthermore, if we look at individual speakers, we also observe considerable variation. Some of the range of variation can be seen in Figure 2, which displays the vowel plots of two speakers (UH, a female speaker, and KA, a male speaker). These two speakers behave quite differently, particularly in the way they produce their central vowels. The central vowels for UH all overlap, whereas KA appears to keep the central vowels relatively separate from each other.
With regard to high and non-high vowel pairs, the F1 for the non-high member of each vowel pair is consistently higher than for the high member, although the magnitude of the difference between [ə] and [ɨ] is smaller than for the other three vowel pairs. The F2 values for high and non-high vowels also show some variation. The F2 value for [i] is visibly higher than that for [ɛ], and the F2 value for [ɨ] is likewise higher than that for [ə], suggesting that the high vowels in these pairs are more fronted than the non-high vowels.

Figure 2 Distribution of Vowels Averaged over the Vowel Timecourse in a Z-Normalised F1 X F2 Space with Data from UH (Female) on the Left Panel and KA (Male) on the Right Panel (the arrows indicate the pair of non-high and high vowels.)
However, this does not seem to be the case for the other two vowel pairs: the F2 values within the pairs [ɤ ~ a] and [u ~ ɔ] look very similar. Thus, some variation is also observed in the F2 values between high and non-high vowels, particularly for the pairs [i ~ ɛ] and [ɨ ~ ə]. However, this variation does not appear to be as dramatic as that in the F1 values.

Model comparison for F1 and F2
In order to estimate the differences in F1 and F2 values for high and non-high vowels in Madurese, we compared a series of linear mixed-effects models. The result of the log-likelihood ratio test in Table 3 shows that the model f1d is the maximal model justified by our data. This model includes Vowel and Place as fixed effects; as random effects it includes by-speaker and by-word random intercepts as well as by-speaker random slopes for Vowel and Place. It is important to note that Place here means the place of articulation of the preceding consonants.

Figure 3 shows the vowel space of Madurese and demonstrates the differences between the pairs of high and non-high vowels (i ~ ɛ, ɨ ~ ə, ɤ ~ a, u ~ ɔ). F1 and F2 values were pooled across speakers and repetitions and were sampled at vowel onset by averaging timepoints 1-3. Table 4 provides the averaged measurement results for the first and second formant frequencies of vowels at vowel onset. The values were pooled across places of articulation, speakers and repetitions. To compare differences in vowel height, we conducted a series of post-hoc pairwise comparisons between vowels. First, we present the pairwise comparisons between the pairs of high and non-high vowels. Table 5 reports a subset of those comparisons. As seen in Table 5, there is a significant difference in F1 values between all pairs tested.

The next question is whether high and non-high vowels also differed significantly in their F2 values. To test this, the same model was used to model F2. As shown in Table 6, the only pair that shows a significant difference in F2 at onset is the pair [ɨ] and [ə] (p < .0001).

[cək:ɤʔ] 'disconnected' were also significantly different from one another. It should be borne in mind that these vowels do not belong to the pairs of high and non-high vowels compared previously.
The reason why it is also important to look at them here is that they are impressionistically very similar. This is also evident in the vowel plots in Figure 3, in which both the F1 and F2 values of these vowels overlap considerably. In order to assess them, we used the same linear mixed-effects model described earlier.

Inferential statistics on F1 and F2 at vowel midpoint
Figure 4 shows the acoustic realisations of the eight surface vowels in Madurese and displays the differences between the high and non-high vowel pairs (i ~ ɛ, ɨ ~ ə, ɤ ~ a, u ~ ɔ) at vowel midpoint. F1 and F2 values were also pooled across speakers, places of articulation and repetitions and sampled at vowel midpoint by averaging the middle four timepoints 5-7. Table 8 shows the averaged measurement results for the first and second formant frequencies of vowels measured at vowel midpoint. The values were pooled across places of articulation, speakers, and repetitions. The same question arises here, namely whether the high and non-high vowels have significantly different F1 values at vowel midpoint. To discover whether there was a significant difference in F1 and F2 values between high and non-high vowels at vowel midpoint, we fitted models as shown in Table 2 and conducted a similar series of between-vowel post-hoc tests. As seen in Table 9 above, all high and non-high vowel pairs have significantly different F1 values at vowel midpoint. The next question is whether there is a significant difference in F2 values between high and non-high vowels at vowel midpoint. As shown in

F1 and F2 as a function of Vowel and Voicing
A number of studies (e.g. Fischer-Jørgensen, 1968; Shimizu, 1996, pp. 61-63) have shown that F1 values following voiceless stops are higher than those following voiced stops (for discussion of vowel quality and consonant voicing for non-native speakers of English, see Ryoo (2001), and for voicing and vowel raising in Sundanese, see Kulikov (2010)). Since voiced and voiceless aspirated stops in Madurese are both followed by high vowels, it is possible to examine these vowels as a function of voicing to see whether the two stop categories exert different effects on F1 and F2. This analysis bears on the issue of whether or not voiced and voiceless aspirated stops share acoustic features. That is, if F1 and F2 following voiced and voiceless aspirated stops are not significantly different, this suggests that they share such features. Figure 5 shows mean F1 and F2 values for high vowels following voiced and voiceless aspirated stops. As we can see, the F1 values following voiced stops tend to be lower than the F1 values following voiceless aspirated stops. This particularly seems to be the case for the vowels

Figure 5 in which the F1 and the F2 values for voiced and voiceless aspirated stops overlap considerably, none of the terms reached statistical significance.

Implication of results for F1 and F2 on Madurese Vowels
We have examined the first and second formant frequencies of Madurese vowels at vowel onset and vowel midpoint by looking at whether the high and

In a nutshell, the pairs of high and non-high vowels in Madurese consistently show significant differences in their F1 values. On the other hand, F2 values have been shown to vary with vowel pairs and vowel timepoints. What is also interesting here is the fact that the vowels [ɤ] and [ə], which are very similar impressionistically even though they do not constitute a pair of high and non-high vowels, demonstrate consistent differences in their F1 and F2 values at both measurement points.
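The two measurement points summarised here rest on averaging a few samples of each formant track: timepoints 1-3 for vowel onset and timepoints 5-7 for vowel midpoint, as stated in the text. The sketch below shows that averaging step only. The helper `window_mean`, the constant names, the ten-sample track length and the toy values are hypothetical; the number of timepoints per vowel in the actual Praat script is not given in this section.

```python
from statistics import mean

# Timepoint windows as stated in the text (1-based, inclusive).
ONSET = (1, 3)
MIDPOINT = (5, 7)


def window_mean(track, window):
    """Average a formant track over a 1-based, inclusive timepoint window."""
    first, last = window
    if not 1 <= first <= last <= len(track):
        raise ValueError("window lies outside the formant track")
    # Convert the 1-based inclusive window to a Python slice.
    return mean(track[first - 1:last])


# Hypothetical F1 track in Hz with ten equally spaced samples.
f1_track = [400.0, 420.0, 440.0, 460.0, 480.0, 500.0, 520.0, 500.0, 480.0, 460.0]

f1_onset = window_mean(f1_track, ONSET)        # mean of samples 1-3
f1_midpoint = window_mean(f1_track, MIDPOINT)  # mean of samples 5-7
```

Averaging over a short window rather than taking a single sample makes each measure less sensitive to a single mistracked formant value.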
As stated earlier, there has been disagreement about the number of vowel phonemes in Madurese. The disagreement has arisen partly because some researchers identify and describe Madurese vowels on the basis of surface realisations rather than on the basis of Madurese phonology. In this article, we argue that Madurese is more economically described as a language with an underlying four-vowel system consisting of /ɛ, ə, a, ɔ/. If we also consider the vowels i and u to be phonemes, this creates problems for the account of the vowel harmony processes and for the analysis of the onsets. That is, doing so simplifies the analysis of the consonants but complicates that of the vowels (Misnadin, 2017). Moreover, it is not clear whether the way in which the words containing these vowels are pronounced reflects Madurese or simply the language from which the words in question have been borrowed. In this case, it would be reasonable to assume that they are pronounced in the way Indonesian words are pronounced, given that many Madurese people also speak Indonesian.
In our observation, native speakers of Madurese rarely change Indonesian words to make them conform to the consonant-vowel (CV) interaction rule when they speak Madurese, i.e. the rule that voiced and voiceless aspirated stops are followed by high vowels while voiceless unaspirated stops and the other consonants are followed by non-high vowels (Table 2; see Misnadin, 2017; Misnadin & Kirby, 2020 for further discussion). This is particularly the case for Indonesian words borrowed from foreign languages such as Dutch and English. This may be related to the fact that Indonesian is considered more prestigious than Madurese because of its status as the national language. Thus, if speakers pronounce Indonesian words in the way native Madurese words are normally pronounced, they may risk being perceived as having a low level of education or even worse. This is obvious when we look again at words which show exceptions to the general rule of the CV co-occurrence restriction or vowel raising below.

The disagreement with regard to the number of vowel phonemes in Madurese partly arises from the fact that some authors also include vowels from loanwords in the Madurese vowel inventory. For example, Davies (2010) argues that since [i] and [u] can also be found in word-initial position in a number of words such as [imigrasi] 'immigration' and [uɟiɤn] 'exam', these vowels need to be included among the Madurese phonemes as well. The question is whether it is necessary to include them as phonemes given that they are found in that position only in loanwords. In fact, there would be a price to pay for including the vowels [i] and [u] as phonemes: it would be difficult to explain the existence of the two vowels on the grounds of the vowel raising rule or the CV co-occurrence restriction, making the rule more complicated than it needs to be (Misnadin, 2017).
Therefore, we argue that it would be more parsimonious if we simply put the words that contain [i] and [u] in word-initial position into exceptions due to loanwords rather than categorise them as separate phonemes. Again, this needs to be done in this way if we prefer maintaining the vowel raising rule across the board in Madurese.
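The four-vowel analysis together with the CV co-occurrence restriction can be stated as a small mapping, sketched below under stated assumptions. The pairings come from the high and non-high pairs discussed in the text (ɛ ~ i, ə ~ ɨ, a ~ ɤ, ɔ ~ u). The function name `surface_vowel` and the onset class labels are hypothetical, and the sketch deliberately ignores vowel harmony beyond the immediately preceding consonant as well as loanword exceptions.

```python
# Underlying non-high vowels and their raised (high) surface counterparts,
# following the vowel pairs discussed in the text.
RAISED = {"ɛ": "i", "ə": "ɨ", "a": "ɤ", "ɔ": "u"}

# Onset classes that trigger raising. The labels are hypothetical names
# for the consonant classes described in the text.
RAISING_ONSETS = {"voiced", "voiceless_aspirated"}


def surface_vowel(underlying, onset_class):
    """Return the predicted surface vowel for an underlying vowel.

    Voiced and voiceless aspirated stops are followed by the high member
    of each pair; all other onsets are followed by the non-high member.
    """
    if underlying not in RAISED:
        raise ValueError("not one of the four underlying vowels")
    if onset_class in RAISING_ONSETS:
        return RAISED[underlying]
    return underlying


after_voiced = surface_vowel("a", "voiced")
after_unaspirated = surface_vowel("a", "voiceless_unaspirated")
```

Under this formulation, a loanword with word-initial [i] or [u] is simply a form that the mapping does not generate, which is the sense in which such words are treated as exceptions rather than as evidence for additional phonemes.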
With regard to the vowels [ɨ] and [ə], whose status previous scholarship has also questioned, we can establish that these two vowels are acoustically distinct in terms of both their F1 and F2 values. The results provide further phonetic evidence for the existence of the high vowel [ɨ] alongside its non-high counterpart [ə]. This suggests that the vowel [ɨ] is not simply posited for convenience, in the sense that every non-high vowel must have a high counterpart because of vowel height alternation under the process of vowel raising and/or lowering.
Thus, unless we take the phonology of Madurese into account particularly on how consonants interact with vowels, we may be led to conclude that Madurese, for instance, can be categorised into a language with a relatively symmetric eight-vowel system. Phonetically speaking, however, such a conclusion also makes sense given that all of the vowels are phonetically distinct in the sense that they relatively occupy their own vowel space. This is particularly obvious if we look at the five peripheral vowels, i.e. [i, ɛ, a, ɔ, u] although the three central vowels [ə, ɤ, ɨ] appear to be clustered together. Finally, it is also interesting to observe that the magnitude of the vowel raising for each vowel pair also showed variations. This may suggest that the effect of consonantal feature spreading, whatever the feature is, depends on individual vowels following the consonants. It is evident that the highest degree in vowel raising occurs to the pairs [a ~ ɤ], [ɛ ~ i] and [ɔ ~ u] respectively while the lowest occurs to the pair [ə ~ ɨ].
In addition, there are some interesting aspects that we can observe about the vowel system in Madurese particularly if we relate the Madurese system to vowel dispersion theory proposed by Liljencrants and Lindblom (1972). That is, considering Madurese only has four underlying vowels, the question is why the vowels are not dispersed as the theory predicts. Specifically, as we argue for a four-vowel system in Madurese, we should expect the vowels to include the predicted /i, ɛ, a, u/ (Liljencrants & Lindblom, 1972, p. 845). This is not the case for Madurese as its vowel system only consists of four underlying vowels which are all nonhigh, i.e. /ɛ, a, ə, ɔ/. This Madurese system is not observed in any four-vowel systems because all languages that belong to the four-vowel system always include the vowel /i/ as one of their vowel phonemes (Becker-Kristal, 2010;Liljencrants & Lindblom, 1972). In addition, the clustering together of the three central vowels [ə, ɤ, ɨ] in a relatively crowded space seems to be inconsistent with one important principle of dispersion theory that vowels have to be maximally dispersed from one another (Liljencrants & Lindblom, 1972).
It may be that the three Madurese vowels do not need to be maximally dispersed for their contrast because they have different syllable structures in the case of the vowels [ə, ɨ] versus [ɤ], i.e. the former are always followed by geminate consonants while the latter is not. On the other hand, the vowel [ɨ] is always preceded by a voiced or voiceless aspirated stop while the vowel [ə] always goes together with voiceless unaspirated stops and the other consonants. Thus, we can assume that these non-vocalic aspects may also function to maximise the perceptual differences between the three vowels.
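The dispersion argument can be made concrete with a small sketch. In Liljencrants and Lindblom's (1972) model, the quantity to be minimised is, to our understanding, the sum of inverse squared distances between all pairs of vowels in a perceptual space, so that closely spaced vowels are heavily penalised. The function `dispersion_energy` and the point sets below are hypothetical: the coordinates are abstract positions in a unit square, not formant measurements of Madurese or any other language.

```python
from itertools import combinations
from math import dist


def dispersion_energy(points):
    """Sum of 1 / d^2 over all pairs of points (lower = more dispersed)."""
    return sum(1.0 / dist(p, q) ** 2 for p, q in combinations(points, 2))


# Four points at the corners of a unit square: well dispersed.
spread = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]

# Same number of points, but two of them crowded in the centre.
clustered = [(0.0, 0.0), (0.0, 1.0), (0.5, 0.5), (0.5, 0.6)]

e_spread = dispersion_energy(spread)
e_clustered = dispersion_energy(clustered)
```

The crowded configuration has a much higher energy than the dispersed one, which is why three central vowels occupying a small region of the vowel space sit uneasily with the theory unless other cues, such as the syllable structure and onset differences noted above, help to keep them apart.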

CONCLUSION
The results of the study have confirmed that, overall, the eight surface vowels of Madurese, i.e. [i, ɛ, ɨ, ə, ɤ, a, u, ɔ], can be distinguished in terms of their F1 and F2 values. Even though they occupy distinct regions of the phonetic vowel space, they cannot all be considered to have phonemic status. This becomes obvious when we look at the phonological system of the language, in which the surface vowels in fact derive from four underlying vowels, all of which are non-high, i.e. /ɛ, ə, a, ɔ/. Another important result of this study is that the three central vowels, i.e. [ɨ, ə, ɤ], which are impressionistically similar, have distinct F1 and F2 values although they do not look well separated in the vowel space, as can be seen in the figures shown above. However, their distinction is probably maximised by the fact that each occurs in a distinct syllable structure, as discussed above. Based on these analyses, we propose that Madurese has a four-vowel system, and this offers a solution to the disagreements on the number of its vowel phonemes. This system is quite unusual compared with the vowel systems of the world's languages (see Liljencrants and Lindblom, 1972). Indeed, the Madurese vowel system poses a challenge to the theory of vowel dispersion proposed by Liljencrants and Lindblom (1972) discussed previously.
Several further studies can be pursued on the basis of the present findings. As this study does not specifically look at different Madurese dialects, it will be interesting to see whether different dialects show different acoustic realisations of their vowels, and hence different vowel systems. In this respect, Kettig and Winter (2017) provide a methodological example that can be consulted for studying language variation by gender, generation, and race. Other relevant works include Jacewicz et al. (2011), Alcorn et al. (2020), Chung (2020), and Thomas (2020). Another aspect worth investigating is how speakers of Madurese perceive Madurese vowels, especially the three central vowels, which impressionistically sound similar. Such a perception study is important in order to see whether there is a mismatch between production and perception in their realisations or whether some speakers do not distinguish them at all. A number of relevant studies deal with these perception and production phenomena, for example Clopper and Dossey (2020), Gunter et al. (2020), Jacewicz and Fox (2020) and Kirby and Misnadin (2019).